Abstract. As computation spreads from computers to networks of computers, and migrates into cyberspace,
it ceases to be globally programmable, but it remains programmable indirectly and partially: network com- 
putations cannot be controlled, but they can be steered by imposing local constraints on network nodes. 
The tasks of "programming" global behaviors through local constraints belong to the area of security. The 
"program particles" that assure that a system of local interactions leads towards some desired global goals 
are called security protocols. They are the software connectors of modern, world wide software systems. 

As computation spreads beyond cyberspace, into physical and social spaces, new security tasks and problems
arise. As computer networks are extended by nodes with physical sensors and controllers, including
humans, and interlaced with social networks, the engineering concepts and techniques of computer security
blend with the social processes of security, which have evolved since the dawn of mankind. These new connectors
for computational and social software require a new "discipline of programming" of global behaviors through 
local constraints. Since the new discipline seems to be emerging from a combination of established models 
of security protocols with older methods of procedural programming, we use the name procedures for these 
new connectors, that generalize protocols. 

In the present paper we propose actor-networks as a formal model of computation in heterogeneous networks
of computers, humans and their devices, where these new procedures run; and we introduce Procedure
Derivation Logic (PDL) as a framework for reasoning about security in actor-networks. On the way, we
survey the guiding ideas of Protocol Derivation Logic (also PDL) that evolved through our work in security
in the last 10 years. Both formalisms are geared towards graphic reasoning and, ultimately, tool support. We
illustrate their workings by analysing a popular form of two-factor authentication, and a multi-channel device
pairing procedure, devised for this occasion.

1. Introduction 

1.1. Motivation and background 

In [5S] we pondered about the "unreasonable ineffectiveness of security engineering", and suggested that 
one of the main causes was that the widely used methods for pervasive software design were low level. The
high level methodologies to specify and design reusable software procedures were not lifted from traditional
computer systems to modern network based computation. In Section IV of [SS], we provided a sketch of a
network computation model that might fill the gap, and be used as a high level tool to specify and analyze
network procedures. But the sketch was very crude, and we did not even have space to provide any examples
of network procedures. So the final part of that story remained rather obscure. We attempt to rectify that
in the present paper.

age            ancient times              middle ages               modern times
platform       computer                   operating system          network
applications   Quicksort, compilers       MS Word, Oracle           WWW, botnets
requirements   correctness, termination   liveness, safety          integrity, confidentiality
tools          programming languages      specification languages   scripting languages

Table 1. Paradigms of computation

1.2. Context 

1.2.1. The ages of software 

In the beginning, engineers built computers, and wrote programs to control computations. The platform of 
computation was the computer, and it was used to execute algorithms and calculations, allowing people to 
discover, e.g., fractals, and to invent compilers, that allowed them to write and execute more algorithms 
and more calculations more efficiently. Then the operating system became the platform of computation, and 
software was developed on top of it. The era of personal computing and enterprise software broke out. And 
then the Internet happened, followed by cellular networks, and wireless networks, and ad hoc networks, and 
mixed networks. Cyber space emerged as the distance-free space of instant, costless communication, where 
any pair of network nodes is directly connected. Nowadays software is developed to run in cyberspace. The 
Web is, strictly speaking, just a software system, albeit a formidable one. A botnet is also a software system. 
As social space blends with cyber space, many social (business, collaborative) processes can be usefully
construed as software systems, that run on social networks as hardware. Many social and computational
processes become inextricable. Table 1 gives a crude picture of the paradigm shifts that led to this remarkable
situation.

But as every person got connected to a computer, and every computer to a network, and every network to 
a network of networks, computation became interlaced with communication, and ceased to be programmable. 
The functioning of the Web and of web applications is not determined by the code in the same sense as in a 
traditional software system: after all, web applications do include the human users as a part of their runtime. 
The fusion of social and computational processes in cyber-social space leads to a new type of information 
processing, where the purposeful program executions at the network nodes are supplemented by spontaneous 
data-driven evolution of network links. While the network emerges as the new computer, data and metadata 
become inseparable, and new types of security problems arise. 

1.2.2. The ages of software security 

In early computer systems, security tasks mainly concerned sharing of the computing resources. In computer 
networks, security goals expanded to include information protection. Both computer security and information 
security essentially depend on a clear distinction between the secure areas, and the insecure areas, separated 
by a security perimeter. Security engineering caters for computer security and for information security by 
providing the tools to build the security perimeter. In cyber space, the secure areas are separated from 
the insecure areas by the "walls" of cryptography; and they are connected by the "gates" of cryptographic 
protocols. 

age            middle ages                   modern times                 postmodern times
space          computer center               cyber space                  cyber-social space
assets         computing resources           information                  public and private resources
requirements   availability, authorization   integrity, confidentiality   trust, privacy
tools          locks, tokens, passwords      cryptography, protocols      mining and classification

Table 2. Paradigms of security

But as networks of computers and devices spread through physical and social spaces, the distinctions
between the secure and the insecure areas become blurred. With network computation, the software-hardware
distinction acquires a new meaning. In contrast with the purposefully built and programmed electronic
computers, the new spontaneously evolving computer-as-a-network includes social networks as a part of its
hardware, while social processes are becoming a part of its software. To follow these developments, computer
science is endorsing themes and tools of social sciences |5ni[lS], while social sciences are increasingly concerned 
with computation |67[ [7] . The formalism of actor-networks arises on this background. 

1.3. Goals and ideas 

Our goal is to contribute towards a formal framework for reliable and practical reasoning about computation
and communication in networks. Since network computation involves adversarial behaviors, security
stands out as the central concern in network computation. But reliable reasoning about security, even in the 
familiar end-to-end networks, usually requires complicated models. Hence the tension between the require- 
ments of reliability and of practicality: reliable reasoning requires a precise formal model, but formal models 
of security tend to be impractically complex. One approach is to mitigate this complexity through auto- 
mated support. We have been studying this solution for many years [151 H?! [551 [3] , and it has been broadly 
supported in the research community [H [51 [51 [T3] . Another approach is to try to decrease the complexity 
through model abstraction and refinement, and through search for convenient and intuitive notations. In 
the present work, we put more emphasis on this second approach, building upon our previous attempts 
[48l [TTl [56] to extend to reasoning about security the incremental modeling methodologies, well established 
in software engineering. Towards this goal, we draw our formal models from the informal reasoning practices, 
and attempt to make them mathematically precise, while trying to keep them as succinct and intuitive as 
possible. The main feature of our formalism is that it provides support for diagrammatically based security 
proofs. Although they cannot be directly automated, these proofs are as formal and as precise as the proofs 
in similar diagrammatic formalisms across mathematics, which also cannot be directly automated. After all, 
very few of the formal proofs presented in mathematics papers and textbooks are "formal enough" to be 
entered into a theorem prover. The hope is, however, that in the end, the two types of security models, those 
designed for software tools, and those designed for human consumption, will converge into a theory that will 
allow automating complex arguments, while resolving some complexities through insightful notations. 

An important instrument of user-friendly mathematical formalisms is the "syntactic sugar", where we
subsume a whole gamut of notational and graphical abbreviations, conventions and abuses. Although unsound 
informal reasoning can be a source of many troubles, and the pedagogical emphasis is usually placed squarely 
against it, sound informal reasoning can be a source of many insights. After all, the formal proofs and 
constructions are seldom born fully shaped and formalized, but begin their life as insights and ideas. A 
useful formalism supports such transformations from insights into formal proofs. The soundness of such 
transformations often depends on a natural selection of notational conventions and abuses. Not entirely 
unintentionally, our formalisms turn out to be rich in graphic and syntactic sugar, and in sound notational 
abuses. We begin by explaining some terminological abuses, starting from the first three words of the title. 

1.3.1. "Actor-network" 

Networks have become an immensely popular model of computation across sciences, from physics and biology, 
to sociology and computer science [511 [10] • Actor-networks |39] are a particularly influential paradigm 
in sociology, emphasizing and analyzing the ways in which the interactions between people and objects, as 
equal factors, drive social processes, in the sense that most people cannot fly without an airplane; but that 
most airplanes also cannot fly without people. Our goal in the present paper is to formalize and analyze some 
security processes in networks of people, computers, and the ever expanding range of devices and objects 
used for communication and networking, blurring many boundaries. The idea that people, computers, and 
objects are equal actors in such networks imposed itself on us, through the need for a usable formal model, 






D. Pavlovic and C. Meadows 



even before we had heard of the sociological actor-network theory. After we heard of it, we took the liberty of 
adopting the name actor-network for a crucial component of our mathematical model, since it conveniently 
captures many relevant ideas. While the originators of actor-network theory never proposed a formal model, 
we believe that the tasks, methods and logics that we propose are not alien to the spirit of their theory. 
In fact, we contend that computation and society have pervaded each other to the point where computer 
science and social sciences already share their subject. 

It should be noted, though, that the goals of this work are completely different from the goals of sociology 
of actor-networks, and that our actor-network formalism deviates from the original ideas in a substantial way, 
even by being a formalism. We make no claims or attempts to faithfully interpret any of the actor-network 
authors; but we remain faithful to the spirit of their endeavor, since they all discourage orthodoxy. 

1.3.2. "Procedures" 

In computer programs, frequently used sequences of operations are encapsulated into procedures, also called 
routines. A procedure can be called from any point in the program, and thus supports reuse of code. 

In computer networks, frequently used sequences of operations are specified and implemented as network 
protocols, or as cryptographic protocols. So protocols are, in a sense, network procedures. Conceptually, if 
not technically, protocol analysis can thus be viewed as an extension of the venerable science of program 
semantics, and of the methods of procedural programming, adapted for the purposes of network computation. 

Beyond computer networks, there are now hybrid networks, where besides computers with their end-to-end
links, there may be diverse devices, with their heterogeneous communication channels, cellular, short
range etc. Online banking and other services are nowadays usually secured by two-factor and multi-factor
authentication, combining passwords with smart cards, or cell phones. A vast area of multi-channel and
out-of-band protocols opens up, together with the web service choreographies and orchestrations; and we have
only scratched its surface. And then there are of course also social networks, where people congregate with
their phones, their cameras and their smiling faces, and overlay the wide spectrum of their social channels
over computer networks and hybrid networks. Many sequences of frequently used operations within these
mixed communication structures have evolved. These are what we call actor-network procedures.

Conceptually, actor-network procedures extend program procedures from computers to networks; and 
they furthermore extend network protocols, and multi-channel protocols, and web service choreographies 
and orchestrations, into social networks. 

Technically, actor-network procedures are, of course, immensely more complicated than their conceptual 
relatives in computer science, because humans use many types of physical resources, communicate through 
many parallel communication channels, with many levels of encoding interleaved over each other. If we 
ignore computers and devices, actor-network procedures already capture the main complexities of social life, 
as actor-network theorists are explaining. They are a sociologists' problem. In any case, they are completely 
out of reach for computer scientists' protocol models, symbolic, information theoretic, and computational, 
all designed for simple end-to-end communication. So a computationally minded reader may wonder why 
bother to bring them up here. 

The reason is that the most frequent transactions that we engage in with our computers, and even among
ourselves, involve actor-network procedures. Online banking and shopping, as well as the checkout in the
supermarket, travel search on the web, as well as the security line at the airport involve actor-network
procedures. Time and again, it has been recognized and reconfirmed that most security breaches nowadays
occur not through cryptanalysis, and not through buffer overflow, but through various forms of "social 
engineering" and channel interactions. Yet "social engineering" has remained a marginal note in technical 
research. An effort to change this may be foolhardy, but it leads to ideas and structures that seem interesting 
to explore — even independently of their utility. 

1.4. Related work 

1.4.1. The expanding concept of a protocol

In social and computational networks, procedures come in many flavors, and have been studied from many 
angles. Besides cryptographic protocols, used to secure end-to-end networks, in hybrid networks we increasingly
rely on multi-channel protocols [64], including device pairing [37]. In web services, standard procedures
come in two flavors: choreographies and orchestrations [BD]. There are, of course, also social protocols and
social procedures, which were developed and studied first, although not formally modeled. As social networks
are increasingly supported by electronic networks, and on the Web, social protocols and cryptographic
protocols often blend together. Some researchers have suggested that the notion of protocol should be extended
to study such combinations [9, 27, 35]. On the other side, the advent of ubiquitous computing has
led to extensive, careful, but largely informal analyses of the problems of device pairing, and of security 
interactions of using multiple channel types [64l |34l EI] • One family of the device pairing proposals has been 
systematically analyzed in the computational model in [S51 [531 HOI EI] ■ 

1.4.2. Protocol logics and graphics

There is a substantial and extremely successful body of research on the formal specification and verification 
of security protocols. As we have remarked, it is largely geared to supporting sound and efficient mechanisms 
for specification and verification, while considerably less attention has been paid to approaches that support 
the user's understanding of the structure of a protocol and how it contributes to its security. There have been 
some notable exceptions, however. In this section we describe the work in this direction that has contributed 
to our own efforts. 

One of the most successful, and in our opinion most interesting formal methods for reasoning about
security protocols is strand spaces [29]. Among its many salient features, the convenient diagrammatic
protocol descriptions were an important reason for its wide acceptance and popularity. It is important to 
note that the strand space diagrams are not just an intuitive illustration, but that they are formal objects, 
corresponding to precisely defined components of the theory, while on the other hand closely resembling the 
informal "arrows-and-messages" protocol depictions, found in almost every research paper and on almost 
every white board where a protocol is discussed. 

Protocol Composition Logic (PCL) was, at least in its early versions [25l [H] [H] [24l [l7] , an attempt to 
enrich the strand model with a variable binding and scoping mechanism, making it into a process calculus 
with a formal handle on data flows, which would thus allow attaching Floyd-Hoare-style annotations to
protocol executions, along the lines of [551 (SH]- This was necessary for incremental refinement of protocol 
specifications, and for truly compositional, and thus scalable protocol analyses, which were the ultimate goal 
of the project. Unfortunately, with these extensions, the handy diagrammatic notation of the strand model 
got lost. 

Protocol Derivation Logic (PDL) has been an ongoing effort |48[ [TTl [551 [31 [HI [57] towards a scalable, 
i.e. incremental protocol formalism, allowing composition and refinement like PCL, but equipped with an 
intuitive and succinct diagrammatic notation, like strand spaces. The belief that these two requirements can 
be reconciled is based on the observation that the reasoning of protocol participants is concerned mostly 
with the order of events in protocol executions.^ It follows that the protocol executions and their logical
annotations both actually describe the same structures, which can be viewed as partially ordered multisets
[62], and manipulated within the same diagrammatic language. This has been the guiding idea of PDL.
Several case studies of standard protocols, and the taxonomies of the corresponding protocol suites, have 
been presented in (351 HH [55] [3] . An application to a family of distance bounding protocols has been presented 
in [33]; and an extension supporting the probabilistic reasoning necessary for another such family has been
proposed in [57]. In the present paper, we propose the broadest view of PDL so far, which should here
be read as Procedure Derivation Logic. The underlying process model is enriched by a new network model, 
to support reasoning about network procedures. The logical annotations are extended accordingly — still 
geared towards the diagrammatic reasoning, which still seems like a reasonable strategy, since principals' 
reasoning towards security remains largely concerned with order of actions. 

^ E.g., in order to achieve mutual authentication, each participant of a run must be able to prove that their own
and their peers' actions in the conversation happened exactly in the order prescribed by the protocol: i.e., that
the received messages were previously sent as claimed, and vice versa.

1.4.3. Computational soundness?

Security is an old social process, but it is a relatively new technical problem. Like many other new problems,
it often looks unreasonably complicated: a couple of lines of a protocol can conceal a subtle problem for
many years. This is sometimes mentioned as the characterizing feature of the field of security. A direct
analysis leads to convoluted reasoning, often with an exponential explosion of the cases to be considered.
As mentioned in Sec. 1.3, this naturally leads to the idea of automated reasoning [36], which we pursued
through several different frameworks [ISl |3S1 131 [2E]. As this idea came to be widely accepted, it led to
a convergence of research to a small number of standard formalisms, based on a sharp division between 
the symbolic and the computational models of security. This division originally started from the empiric 
observation, that motivated the seminal paper [T], that the extant security research was broadly based 
on two different models. Since the computational model is more precise, whereas the symbolic model is 
easier to use, the way to get the best of both worlds is to demonstrate that the proofs in the simpler model 
remain valid in the more precise model; in other words, that the symbolic model was computationally sound.
This approach led to many important results and useful tools [SI IH HI] • But the focus on computational 
soundness led some researchers to begin viewing all formal models of security as approximations of the 
standard computational model, forgetting that all models are approximations of some real processes. In 
the ensuing confusion, even the models involving the features that ostensibly go beyond the computational 
model (such as timed channels [57]) were required to demonstrate their computational soundness. This is,
of course, a meaningless requirement, since a model can only be sound or unsound with respect to a model 
where it can be faithfully interpreted. 

Our current effort is again of this type, as the actor-network model includes several features that preclude 
computational interpretations, and render the question of its computational soundness meaningless. One 
such feature is the fact that the actors needn't be standard computers: in this paper, we will see networks 
involving humans, with their free will; but they could also be, e.g. ants, drawing unusual computational 
powers from their pheromones [H]. Indeed, a framework that attempts to capture some social interactions, 
as announced in the title of this paper, cannot be captured by a purely computational model, and thus can
hardly be expected to be computationally sound. On the other hand, even if we agree to simulate all actors
by Turing machines, including ants and humans, the resulting actor-network will still not boil down to the 
computational model, since the diverse communication channels, that can be specified in an actor-network, 
cannot be reduced to interactions of Turing machines. This is further discussed in Sec. 2.

Undoubtedly, the fact that our model cannot be proven computationally sound, or even given a compu- 
tational interpretation, can be interpreted as the evidence that we are modeling what cannot be modeled; 
whereas the fact that our proofs cannot be automated can be viewed as the evidence that they are not 
completely formal. Both interpretations are true, for some suitable meanings of the words "formal" and
"model". Instead of arguing whether these meanings are reasonable or not, we present our formal model.
We contend that our actor-network based proofs in Procedure Derivation Logic are as rigorous as the proofs 
in any of the standard mathematical formalisms; and hopefully somewhat insightful for the reader. In par- 
ticular, our diagrammatic proofs can be viewed as a formal method similar to diagram chasing in category 
theory [42[ I54j , from which we drew inspiration. 



Outline of the paper 

Sec. 2 introduces the formal model of actor-networks. Sec. 3 explains how actor-networks compute, and
introduces the formalisms to represent that computation, all the way to actor-network procedures. Sec. 4
presents Procedure Derivation Logic (PDL) as a method for reasoning about actor-network procedures. In
Sec. 5 we provide the first case studies using PDL: we analyze the two-factor authentication in online banking,
and a device pairing procedure combining physical and biometric channels. Sec. 6 contains a discussion of
the results and the future work.



2. Actor-network model 

2.1. A computer is a network is a computer 

The standard model of computation is a Turing machine. It uses a tape as a storage medium. Sometimes
additional tapes are used to represent the input and the output interfaces. Probabilistic Turing machines
also read some random strings from a dedicated tape, whereas oracle Turing machines communicate with the
oracle through an additional tape. Last but not least, interactive computation is often modeled using several
Turing machines that interact with each other on joint tapes. Such joint tapes are actually the communication
channels between the Turing machines.

Interactive Turing machines can be viewed as a computational network, with the machines as nodes,
and the joint tapes as links between them. However, pervasive networks that compute in the world around
us nowadays include a wide variety of computational agents at their nodes, including humans, and the
various devices with different computational powers. Luckily, an abstract view of the nodes suffices for most
analyses. We usually just need to specify the state, i.e. the variables where the node can store its data;
and we postulate which computations can be effectively performed by the node, and which computations
are infeasible.

The channels between the nodes can be viewed as a generalization of the joint tapes, shared by Turing 
machines in their interactions. But while the joint tapes trivially pass information from one machine to 
the other, nontrivial channels perform nontrivial data transformations. E.g., a pair of nodes (which may 
or may not be Turing machines) can be connected by a noisy binary channel, flipping each bit with a 
certain probability while passing it from one node to the other. Another pair of computational agents can 
be connected by a cyber channel, controlled by an adaptive attacker, who can change and modify the data 
flows at will. In symbolic protocol analysis, such attackers are usually specified as simple state machines. 
In cryptography, they are usually modeled as probabilistic polynomial-time Turing machines. In network 
models, it is convenient and intuitive to view them as processes encapsulated in channels. 

In summary, the simplest model of network computation consists of 

• computational agents, some of them controlled by various parties, others available as resources; and 

• communication channels between the agents, supporting different types of information flows. 
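
The view of a channel as a process that transforms data in transit can be illustrated with a small Python sketch. It is only an illustration and not part of the formal model: the function names `noisy_binary_channel` and `adversarial_channel`, and the representation of messages as lists of bits, are assumptions made for this example.

```python
import random
from typing import Callable, List

Bits = List[int]


def noisy_binary_channel(bits: Bits, flip_prob: float, rng: random.Random) -> Bits:
    """Pass each bit through, flipping it independently with probability flip_prob."""
    return [b ^ 1 if rng.random() < flip_prob else b for b in bits]


def adversarial_channel(bits: Bits, attacker: Callable[[Bits], Bits]) -> Bits:
    """A channel that encapsulates an attacker process (hypothetical interface):
    the exit receives whatever the attacker makes of the data in transit."""
    return attacker(bits)


if __name__ == "__main__":
    rng = random.Random(0)
    sent = [1, 0, 1, 1, 0, 0, 1, 0]
    # Some bits may arrive flipped on the noisy channel.
    print(noisy_binary_channel(sent, 0.1, rng))
    # An attacker who replaces every message with zeros.
    print(adversarial_channel(sent, lambda m: [0] * len(m)))
```

In both cases the sending and the receiving node are left unspecified; all of the nontrivial behavior sits in the channel, which is the point of encapsulating attackers in channels.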



2.2. Idea of actor-networks 

Actor-networks depict social processes as computations, and computation as a social process. An example of 
an actor-network is a configuration consisting of a musician and her instrument. Their intended interaction
is the music. The process and the result are highly structured, and the network representation helps with 
the analysis. The network first of all displays the symmetry of this interaction: the musician cannot play 
without the instrument, and the instrument cannot play without the musician. The fact that some musician's 
instrument may be a part of her body (e.g., her voice), and that some instrument's musician may be a 
computer makes the picture only more interesting. A smart card and a smart card reader form another 
network of this type. But this network is complicated by the need for someone to key in the PIN of the card.
The network grows still further if the card reader is connected to a bank, and perhaps dispenses money. 

When networks involve heterogeneous nodes, and heterogeneous communication channels, then the diverse
computational resources lead to different computational powers. This is where network computation essentially
deviates from machine computation, where according to Church's Thesis, all the different machines
have the same computational powers. In a network, one computer may be equipped with a camera and may 
provide a visual channel to a remote user, whereas another computer may be equipped with sensors and 
controllers, allowing it to stabilize the flight of an aircraft. A smart card can perform some cryptographic 
operations when inserted in a reader, and other ones in the contactless mode. A musical instrument can 
produce one type of music with one musician, and something completely different with somebody else. The 
musician also depends on the instrument. Furthermore, after they are configured with their instruments, 
the musicians may further configure themselves into a higher-order configuration: an orchestra. And the 
orchestra can also be viewed as an actor within the configuration of an opera performance...

Such configurations are what we call actor-networks. A computational agent who participates in a configuration
is an actor, in the sense that she plays a particular role assigned to her by a particular network
procedure. As computational networks spread and diversify, it is becoming increasingly important, and increasingly
difficult, to assure that procedures provide the desired actor and network behaviors. Towards this
goal, we formalize the above intuitions about actor-networks, and build a framework for reasoning about 
their procedures. 

Remark. The hierarchical structure of our actor-network formalism is alien to the spirit and the letter of
the original actor-network idea from sociology [31]. But it is essential for the goals of our logical analyses, which
are different from the goals of sociological analyses. It may be of interest to explore the relation between the
two sets of goals.

2.3. Formalizing actor-networks 

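Definition 2.1. An actor-network consists of the following sets:

• identities, or principals $\mathcal{J} = \{A, B, \ldots\}$,

• nodes $\mathcal{N} = \{M, N, \ldots\}$,

• configurations $\mathcal{P} = \{P, Q, \ldots\}$, where a configuration can be

  – a finite set of nodes, or

  – a finite set of configurations;

• channels $\mathcal{C} = \{f, g, \ldots\}$, and

• channel types $\Theta = \{\tau, \varsigma, \ldots\}$

given with the following structure:

$$\mathcal{J} \xleftarrow{\;\copyright\;} \mathcal{P} \underset{\varrho}{\overset{\delta}{\leftleftarrows}} \mathcal{C} \xrightarrow{\;\vartheta\;} \Theta$$

where

• the partial map $\copyright : \mathcal{P} \rightharpoonup \mathcal{J}$ tells which principals control which configurations,

• the pair of maps $\delta, \varrho : \mathcal{C} \to \mathcal{P}$ assign to each channel $f$ an entry $\delta f$ and an exit $\varrho f$, and

• the map $\vartheta : \mathcal{C} \to \Theta$ assigns to each channel a type.

An actor is an element of a configuration.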

Actors formally. By the above definition, a configuration is thus a tree whose leaves are annotated by 
network nodes. A configuration is an actor when viewed as an element of another configuration. In other 
words, an actor is formally a maximal subtree of a configuration tree. It is useful to distinguish the trees 
that come together to form another tree because this is what they do to perform some action together. For 
instance, a hand, a pencil and a piece of paper come together as actors to record a thought. A hand itself is 
a configuration of fingers, which come together to hold the pencil. The pencil is a configuration of graphite 
and wood. A door, a lock and a key come together to enforce someone's authority over a space. The lock 
is a configuration of its metal components. The writing configuration and the locking configuration come 
together to put a novel in a drawer. And all of it is just trees. 
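
As an informal illustration, the tree reading of configurations can be sketched in Python. The encoding of nodes as strings and of configurations as frozensets, and the helper names `config`, `actors` and `leaves`, are assumptions made only for this sketch; they are not part of the formalism.

```python
from typing import FrozenSet, Union

# Assumed encoding: a node is a string, a configuration is a frozenset
# whose elements are nodes or other configurations.
Config = Union[str, FrozenSet["Config"]]


def config(*parts: Config) -> Config:
    """Assemble the given actors into a configuration."""
    return frozenset(parts)


def actors(p: Config) -> FrozenSet[Config]:
    """The actors of p are its elements; a bare node has none."""
    return p if isinstance(p, frozenset) else frozenset()


def leaves(p: Config) -> FrozenSet[str]:
    """The network nodes that annotate the leaves of the tree p."""
    if isinstance(p, str):
        return frozenset({p})
    return frozenset(n for a in p for n in leaves(a))


# The writing configuration from the text: a hand, a pencil and paper.
hand = config("finger_1", "finger_2", "finger_3")
pencil = config("graphite", "wood")
writing = config(hand, pencil, "paper")

print(sorted(leaves(writing)))
```

In this encoding the hand is an actor of the writing configuration, while the graphite is only a leaf below the pencil.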

Notation. We denote by $N_B$ a node $N$ controlled by the principal $©N = B$. We write $g = (P \xrightarrow{\tau} N_B)$ for
a channel $g$ of type $\vartheta g = \tau$, with the entry $\delta g = P$, and with the exit $\varrho g = N$ controlled by $©N = B$. Since
there is usually at most one channel of a given type between two given configurations, we usually omit the
label $g$, and write just $P \xrightarrow{\tau} N_B$ to denote this channel.

2.4. Examples of networks 
2.4.1. Cyber networks

Cyber networks are built following the "end-to-end" architecture [63]. In our formalism, a cyber network is
characterized by the fact that

• $\Theta = \{\mathsf{cyb}\}$, i.e. there is just one channel type, which we call cyber channel. This is the insecure channel
over which cryptographic protocols are usually run.

• $\mathcal{P} = \mathcal{N}$, i.e. the only configurations are the nodes.

• There is a channel $M \xrightarrow{\mathsf{cyb}} N$ for every pair $M, N \in \mathcal{N}$.

• All communication is done by broadcast from the sender to all nodes in the network. The recipient does
not observe the sender directly (although, of course, the sender can identify, or misidentify herself in
the message). If a principal controls several nodes, it makes no difference which node she uses to send a
message. Without loss of generality, we can thus assume that each principal controls exactly one node,
i.e. that $\mathcal{N} = \mathcal{J}$.

The actor-network structure of a cyber network is thus degenerate, and boils down to a single type
$\mathcal{P} = \mathcal{N} = \mathcal{J}$, since the only configurations are the nodes, and the nodes are in one-to-one correspondence with
the principals. That is why in crypto protocol analysis, we usually just specify how many different principals
should play different roles. The fact that any two principals, viz network nodes, are directly connected by
a completely insecure cyber channel is assumed tacitly. The cyber network with three nodes/principals is
presented on Fig. 1.

The fact that the cyber channels are insecure can be captured in the model of a cyber network by assuming 
that all traffic is routed through the attacker, i.e. that Alice, Bob and Carol are all linked with each other 
through Mallory. In a sense, the attacker Mallory is the embodiment of cyberspace. This architecture is
presented on Fig. 2. 

2.4.2. An actor-network for two factor authentication

To mitigate phishing attacks, most online banks have rolled out two factor authentication. This means 
that they do not just verify that the user knows a password, but also something else — which is the 
second authentication factor. This second factor often requires some additional network resources, besides 
the internet link between the customer and the bank. This is the first, quite familiar step beyond simple 
cyber networks. 

In the simplest case, the bank authenticates the browser used to access the service, by leaving a persistent 
cookie. The server often also records some data about the user's computer and network location. The user only
notices this when she tries to access the bank from another location, or using another browser: she is then
asked to go through a round of "mother's maiden name" type of challenge questions. A more interesting
type of second factor are the single-use Transaction Authentication Numbers (TANs) that the server may
generate. Initially, they were predistributed on paper. Nowadays they are often sent to the user's mobile
phone in an SMS message when login is initiated. The user is thus authenticated as the owner of her mobile
phone. The other way around, the server is authenticated to be in possession of the user's phone number, which
eliminates the general phishing attacks. 

Some banks authenticate that the user is in possession of her smart card. The underlying actor-network
is shown on Fig. 3. The user Alice controls her computer $C_A$ and her smart card $S_A$. She is also given a portable
smart card reader $R$. She inserts the card in the reader to form the configuration $Q$. The reader is available
to Alice, but any other reader would do as well. Configured into $Q$, the smart card and the reader verify that
Alice knows the PIN, and then generate the login credentials, which Alice copies from $R$'s screen to her
computer $C_A$'s keyboard, which forwards them to the bank Bob's computer $C_B$. The details of the authentication
procedure will be analyzed later.

Fig. 3. A pervasive network: Online banking with a smart card reader

Fig. 4. Actor-network for device handshake

In summary, the network thus consists of

• principals $\mathcal{J} = \{A, B\}$,

• nodes $\mathcal{N} = \{I_A, C_A, S_A, R, C_B\}$,

• configurations $\mathcal{P} = \mathcal{N} \cup \{Q\}$, where $Q = \{S_A, R\}$,

• and the following six channels:

– cyber channels $C_A \rightleftarrows C_B$ between Alice's and the bank's computers,

– visual channel $C_A \to I_A$ from Alice's computer to her human $I_A$,

– keyboard $I_A \to C_A$ from Alice's human to her computer,

– visual channel $R \to I_A$ from the smart card reader to Alice's human,

– keyboard $I_A \to R$ from Alice's human to the card reader.
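
The data of this network can be written down directly. The following Python sketch is only an illustration: the string names, the tuple encoding of channels, and the helpers `controlled_by` and `channels_of_type` are assumptions of the sketch. The control map follows the subscript convention ($N_B$ is controlled by $B$), so the reader $R$ and the configuration $Q$, which carry no subscript, are left uncontrolled here.

```python
# Assumed encoding of the two factor authentication network.
principals = {"A", "B"}
nodes = {"I_A", "C_A", "S_A", "R", "C_B"}
Q = frozenset({"S_A", "R"})            # the card inserted in the reader
configurations = nodes | {Q}

# Partial control map: only subscripted nodes have a controlling principal.
control = {"I_A": "A", "C_A": "A", "S_A": "A", "C_B": "B"}

# Each channel is (type, entry, exit).
channels = [
    ("cyber", "C_A", "C_B"),
    ("cyber", "C_B", "C_A"),
    ("visual", "C_A", "I_A"),
    ("keyboard", "I_A", "C_A"),
    ("visual", "R", "I_A"),
    ("keyboard", "I_A", "R"),
]


def controlled_by(principal):
    """The configurations that the given principal controls."""
    return {p for p, owner in control.items() if owner == principal}


def channels_of_type(kind):
    """Entry/exit pairs of all channels of the given type."""
    return [(entry, exit_) for k, entry, exit_ in channels if k == kind]


print(len(channels), sorted(controlled_by("A")))
```

Counting the two directions of the cyber link separately gives the six channels listed above.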

2.4.3. An actor-network for device handshake

Suppose that Alice and Bob have some hand held devices and that they want to pair them, i.e. set up a 
secure cryptographic channel, without any previous encounters or infrastructure. There is a whole industry 
of methods to do this. We describe a method inspired by [43] and [10]. Alice's and Bob's devices $D_A$ and
$D_B$ are equipped with accelerometers. If a device with an accelerometer is shaken, then the accelerometer
can be used as a source of randomness. If two devices are shaken together, their accelerometers will generate
roughly similar random strings. A shared random string can be extracted by the techniques described in
[43], or using fuzzy extractors [19]. We simply assume that a jointly samplable source is given, and denote
it by the node $S$.

We now describe an actor-network supporting a device handshake procedure, where the usual device
pairing task is strengthened by the requirement that the secret shared by the devices $D_A$ and $D_B$ is also
bound to the identities of their human owners $I_A$ and $I_B$. Fig. 4 shows an actor-network supporting, in a
sense, a half of this task: it will allow Alice to shake two devices together, to extract the shared secret; and
moreover Alice's device $D_A$ will biometrically verify that it is being shaken by Alice's human $I_A$, and not
by someone else. If the device $D_A$ signals whether this verification succeeds in a way visible to Bob's human
$I_B$, then Bob knows that the key extracted into his device $D_B$ is shared with $D_A$, and bound to the identity
of $I_A$. Alice's human $I_A$ can obtain similar assurances in an analogous round of the same procedure. The
network for both rounds is depicted on Fig. 5.

Fig. 5. Actor-network for two-round device handshake

Formally, in the actor-network for one round, depicted on Fig. 4, Alice controls the configuration $Q_A$,
which consists of her device $D_A$, Bob's device $D_B$, and Alice's hand $I_A$, holding the two devices together.
Alice's action of shaking the devices is represented as $Q_A$'s action of sampling a source of randomness $S$
along a devoted channel. The accelerometers are abstracted away, and reduced to the node $S$. The fuzzy
extractors are abstracted away and reduced to the fact that the randomness conveyed to $Q_A$ is distributed
to both of the actors $D_A$ and $D_B$ participating in it. The details of the procedure are analyzed later. Here
we just summarize that the actor-network for device handshake consists of the following data:

• two principals in $\mathcal{J}$, Alice and Bob,

• five nodes in $\mathcal{N}$:

– $D_A$ and $D_B$ are Alice's and Bob's devices,

– $I_A$ and $I_B$ are Alice's and Bob's human identities,

– $S$ is a source of randomness;

• one configuration in $\mathcal{P}$:

– $Q_A$ is the configuration where Alice holds her and Bob's devices in her hand: it consists of $I_A$, $D_A$
and $D_B$;

• eight channels in $\mathcal{C}$:

– two cyber channels $D_A \rightleftarrows D_B$ between Alice's and Bob's devices;

– one biometric channel:

• $I_A \to D_A$ from Alice's hand to her device;

– three visual channels:

• $D_A \to I_A$ from Alice's device to her eyes,

• $D_B \to I_B$ from Bob's device to his eyes,

• $Q_A \to I_B$ from Alice's hand, holding both her and Bob's device, to Bob's eyes;

– one physical channel:

• $S \to Q_A$ from the source of randomness to Alice's configuration, which conveys the randomness
to $D_A$ and $D_B$.

To assure that each principal contributes to the randomness, the above procedure can be repeated with Bob
shaking. The actor-network where both Alice and Bob see each other shaking both devices is depicted on
Fig. 5.
2.5. Metaphysics of security

2.5.1. Knowing, Having and Being 

It is often repeated that security is based on: 

• something you know: digital keys, passwords, and other secrets, 

• something you have: physical keys and locks, smart cards, tamper-resistant devices, or 

• something you are: biometric features, such as fingerprints, eye irises, and other unmodifiable properties 
of your body; or the capabilities that you cannot convey to others, such as your handwriting. 

An identity can thus be determined by its secrets, its tokens, and its features. In our model, this is captured 
by three levels of control that a principal may have over its nodes: 

• secrets: what you know can be copied and sent to others, 

• tokens: what you have cannot be copied, but can be given away, whereas 

• features: what you are cannot be copied, or given away. 
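
The three levels of control differ only in whether an item can be copied and whether it can be given away. A minimal Python sketch of this distinction follows; the names `Kind`, `CAPABILITIES` and `share`, and the representation of possession as a set of holders, are illustrative assumptions rather than part of the model.

```python
from enum import Enum


class Kind(Enum):
    SECRET = "what you know"
    TOKEN = "what you have"
    FEATURE = "what you are"


# (can be copied, can be given away), following the three bullets above.
CAPABILITIES = {
    Kind.SECRET: (True, True),
    Kind.TOKEN: (False, True),
    Kind.FEATURE: (False, False),
}


def share(kind, holders, giver, receiver):
    """Return the set of holders after giver passes the item to receiver."""
    if giver not in holders:
        raise ValueError("giver does not hold the item")
    copyable, transferable = CAPABILITIES[kind]
    if copyable:
        return holders | {receiver}              # both now hold it
    if transferable:
        return (holders - {giver}) | {receiver}  # it changes hands
    raise ValueError("features cannot be copied or given away")


print(share(Kind.SECRET, {"Alice"}, "Alice", "Bob"))
print(share(Kind.TOKEN, {"Alice"}, "Alice", "Bob"))
```

Sharing a secret enlarges the set of holders, handing over a token moves it, and a feature stays with its single holder.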

Comments. The common end-to-end security goals are usually realized by means of cryptographic software, 
and the principals prove their identities by their secrets. In cyber networks, a principal can be identified with 
the list of secrets that she knows. If Alice and Bob share all their secrets, then there is no way to distinguish 
them by the challenges that can be issued on the standard completely insecure network channel.* For all
purposes, they must be considered as the same principal. 

In pervasive networks, on the other hand, security is also supported by physical security tokens and 
hardware. Formally, this is where the network model becomes nontrivial: security tokens correspond to 
network nodes which may be controlled by one principal at one point in time, and by another one at another 
point. 

Finally, security features correspond to network nodes which are controlled by a single principal, and 
cannot be relinquished. Such nodes correspond to biometric properties. Their counterparts are the nodes that
represent biometric devices. When all is well, a biometric channel thus has a biometric property at its entry, 
and a biometric device capable of observing this property at the exit. 

2.5.2. What is an identity? 

In a cyber network, anyone who knows all my secrets can impersonate me, and the standard models thus 
assume that an identity is a set of secrets.† In a pervasive network, even if someone knows all my secrets, we
can still be distinguished as long as only one of us has my smart card, or my fingerprints, or if only one of 
us is standing at the door. In the actor-network model, besides the secrets, there are also the various tokens 
and features that a principal may control. An identity is thus a set of actors, which may include tokens and 
features. Formally, the set of principals J can be represented 

• in a cyber network along the injection

$$\Gamma : \mathcal{J} \hookrightarrow \wp\mathcal{T}$$

which assigns to each identity the set of terms that she knows [56]; and

• in an actor-network, along the injection

$$® : \mathcal{J} \hookrightarrow \wp\mathcal{P}, \qquad A \mapsto \{P \in \mathcal{P} \mid ©P = A\}$$

which assigns to each identity the set of configurations that she controls.

* Here we assume that they share the secrets with each other dynamically: whenever one of them learns a secret, it will immediately be shared with the
other. This implies that they also observe the events on the same set of network nodes.

† In reality, even in an end-to-end network, two principals with the same set of secrets but, say, different computational powers,
can be distinguished by timing their responses. Or they may be distinguished by their histories, since they may have derived
their secrets from different initial data, as explained in [55, Sec. IV.D.1]. The standard models, however, abstract away all that.
For simplicity, we assume here that the storage containing the terms from $\Gamma(A)$ is subsumed among the
nodes that $A$ controls. More about this in Sec. 3.2.

Why do we use the words "principal" and "identity" as synonyms? For the benefit of the readers 
with a background in protocol analysis, let us emphasize that principals do not perform any actions in the 
present model. The principals control their actors, and the actors perform the actions, and play roles. A 
principal determines if and when its actors perform an action, and can thus coordinate the order in which 
the actions of her actors will be executed. But without the actors, a principal cannot execute any actions. 
That is why the alternative term "identity" may be preferred over "principal". On the other hand, for the
special case of protocols, the concept still boils down to the familiar idea of principals, who play their roles 
in protocols, etc. So we retain that term as well. 

Metaphysics of actors and principals. Intuitively, the relation between a principal and her actors in an 
actor-network can be construed in terms of the mind-body duality: the principal is the mind, and the actors 
are some parts of the body, that the mind can use to observe the world and to act in it. The data received 
or sampled by some actors through the suitable channels are directly available to the principal: e.g., the 
principal may control a camera, and observe the visual signal that the camera receives. On the other hand, 
some other actors may not convey their data to the principal that controls them: the camera may have no 
cable, or not enough light. The body has its limitations. Which type of information each actor conveys to 
its principal must be specified by the procedure specific axioms. Such specifications determine the semantics 
of each model and the intent of each actor-network procedure. 

2.5.3. What are the channel types? 

Some of the channel types that we shall study are: 

• cyber channels: each node broadcasts to all nodes; there is no notion of distance; the recipient cannot
observe the sender;*

• visual channel: the events at all nodes within some distance are observed; the observed nodes may or 
may not observe that they are observed; 

• binary channel: streams bits from one node to another, flipping them with some given probability. 

The binary channel is one of the basic concepts of information theory, capturing a simple notion of random
noise. Intuitively, the cyber attacker can be viewed as "adaptive noise", disturbing the integrity of the
messages.

3. Actor-network processes 

3.1. Computation and communication 

Computation in a network consists of events, which are localized at nodes or configurations. An event that
is controlled by a principal is an action. 

Communication in a network consists of information flows along the channels. Each flow corresponds to 
a pair of events: 

• a write event at the entry of the channel, and 

• a read event at the exit of the channel. 

There are two kinds of flows: 

• messages, which consist of a send action at the entry of the channel, and a receive coaction at the exit; 
and 

• sources, which consist of a sample action at the exit, and an emit coaction at the entry.



* We can assume that the sender always includes her identity into the message, within some standard format such as email. 
But such source claims can be easily spoofed in cyber space. 
Table 3. Flows and events

The information flows and the corresponding events are summarized in Table 3. The black dots mark the
actions. A consistent action-coaction pair is called an interaction: i.e., an interaction consists of a send action
and a receive coaction, or of an emit coaction and a sample action. We presently consider only these two types
of interactions. Both the receive coactions and the emit coactions are construed as passive events: neither
the principal who receives a message, nor the one who emits from a source controls when this happens. Of
course, in reality a principal may, e.g. actively refuse to receive a message, or to emit from a source, etc. But
our goal is to enable simple analyses, and we leave these details outside the scope of the general model.

A computational process that is localized at a node proceeds as in the traditional models of computation. 
The node can be thought of as a state machine (e.g. a Turing machine), and the computational events change 
its state. An event at a configuration $P$ may change the states of any of the actors $N \in P$. In the actual
analyses, the state changes often need to be traced, but we did not encounter an example where an actual 
state machine would need to be specified. The most abstract models that capture the relevant features 
usually support the simplest analyses, hiding the implementation details. 

Besides transferring information from one configuration to another, the flows also synchronize the events 
that take place at different localities, because: 

• every receive coaction must be preceded by a corresponding send action, and 

• every sample action must be preceded by a corresponding emit coaction. 

If a source has not been emitted to anywhere, then there is nothing to sample, and no sampling of that
source can occur. If a message has not been sent, then the corresponding receive event cannot occur. So

• when I receive a message, then I know that it must have been sent previously by someone; and

• when I sample a source, then I know that someone must have emitted to this source.

That is how I draw conclusions about non-local events from the observations of my own local actions. This
is formalized in Sec. 4.3.1.
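
The synchronization constraint can be illustrated by a small check over a recorded trace. The Python below is a sketch under simplifying assumptions: events are listed in a single linear order (the text only requires a partial order), a read event is matched with a write event by equality of terms, and the names `WRITE_FOR` and `unsynchronized` are introduced only for this illustration.

```python
# Each read event needs an earlier write event of the matching kind.
WRITE_FOR = {"receive": "send", "sample": "emit"}


def unsynchronized(trace):
    """Return the read events not preceded by a matching write event.

    trace is a list of (kind, term, place) triples in temporal order.
    """
    written = set()
    bad = []
    for kind, term, place in trace:
        if kind in WRITE_FOR:
            if (WRITE_FOR[kind], term) not in written:
                bad.append((kind, term, place))
        elif kind in ("send", "emit"):
            written.add((kind, term))
    return bad


good_trace = [
    ("send", "m", "Alice"),
    ("receive", "m", "Bob"),
    ("emit", "r", "S"),
    ("sample", "r", "Q_A"),
]
bad_trace = [("receive", "m", "Bob"), ("send", "m", "Alice")]

print(unsynchronized(good_trace))
print(unsynchronized(bad_trace))
```

A trace that passes this check is one where every local read licenses the conclusion that a corresponding write took place somewhere earlier.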

Intuitions and ideas. A configuration can be thought of as a mechanism, assembled from separate com- 
ponents that may be owned and controlled by different principals. Another view is that a configuration is 
like a team, composed of the players that came to act together towards some goal, but will separate and 
go their own ways when they are done. The usual (physical) handshake, confirming a social contact, can be 
viewed as a configuration. A couple dancing together is a configuration. A band playing music together is a 
configuration. 

A channel is like a wire, connecting two configurations. The messages and the sources are two types of 
flows through a channel. If Alice wants to send a message, she needs a channel to send it on. Bob is on the 
other side of the channel, passively waiting to receive the message. If Bob wants to sample a source, he needs 
a channel for that. Alice is on the other side of the channel, passively emitting. In both cases, Alice is at the
entry of the channel, and Bob is at the exit of the channel. 

Besides the channels that connect it to other configurations, a configuration may have internal methods 
for coordination among its nodes. E.g., a handshake is coordinated by two hands sensing each other. A 
couple of dancers develop signals that coordinate their dance. In some cases, such signals need to be made 
explicit as information flows. A configuration may need to send itself a message, or to sample itself as a 
source: e.g., to assure that genuine randomness is extracted. While the internal signaling and coordination 
capture controlled processes within a configuration, there are many processes that take place within a single 
configuration that are not entirely controlled. Dance groups and the bands of musicians have, besides the 
subtle forms of internal signaling, evolved complex external procedures to synchronize and coordinate. Such 
procedures can be thought of as primordial social software. Analyzing them within a formal model might 
conceivably open interesting possibilities of electronic support. 
3.2. Formalizing data as terms

Each flow carries some data, which contain information. In abstract models, data are represented as terms
of an algebra: the content of a message is an element of an algebra. We shall also represent the emission from
a source as an element from the same algebra. The algebraic operations correspond to the data processing
operations. In the standard symbolic protocol model [20], the messages are the terms of a free algebra of encryption
and decryption operations. More general algebraic models allow additional operations, and additional
equations [12]. Recall that an algebraic theory is a pair $(O, E)$, where $O$ is a set of finitary operations (given
as symbols with arities), and $E$ a set of well-formed equations (i.e. where each operation has a correct number
of arguments) [31].

Definition 3.1. An algebraic theory $\mathbb{T} = (O, E)$ is called a data theory if $O$ includes a binary pairing $(-, -)$
operation, and the unary operations $\pi_1$ and $\pi_2$ such that $E$ contains the equations $\pi_1(u, v) = u$, $\pi_2(u, v) = v$,
and $((x, y), z) = (x, (y, z))$. A data algebra is a polynomial extension $\mathcal{T}[X]$ of a $\mathbb{T}$-algebra $\mathcal{T}$.

Function notation. When no confusion seems likely, we elide the function applications to concatenation,
and write $f.x$ instead of $f(x)$. A function of two arguments $e(x, y)$ is thus identified with its curried
form $e.x.y$, and $e.x$ abbreviates $e(x, -)$. By abuse of notation, the pair $(x, y)$ can thus be written as $x, y$, and
$(x, (y, z)) = ((x, y), z)$ as $x, y, z$.

When no confusion is likely, we even elide the dot from the concatenation and simply write $fx$ instead
of $f.x$, or $f(x)$.

Tupling. The equation $(x, (y, z)) = ((x, y), z)$ in the above definition implies that there is a unique n-tupling
operation for every n. The first two equations imply that the components of any tuple can be recovered.

Random values are represented by indeterminates. A polynomial extension $\mathcal{T}[X]$ is the free $\mathbb{T}$-algebra
generated by adjoining a set of indeterminates $X$ to a $\mathbb{T}$-algebra $\mathcal{T}$ [31, §8]. The elements $x, y, z, \ldots$
of $X$ are used to represent nonces and other randomly generated values. This is justified by the fact that
indeterminates can be consistently renamed: nothing changes if we permute them. That is just the property
required from the random values generated in a run of a protocol. Of course, this is not the only requirement
imposed on nonces and random values: the other requirement is that they are known only locally, i.e. only
by those principals who generate them, or who receive them in a readable message. This requirement is not
formalized within the algebra of messages, but by the binding rules of process calculus [17, 35]. Here we
capture it by the freshness axioms in Sec. 4.3.2.

Stores are nodes. While the random values are thus algebraically presented as the indeterminates (i.e. as
the variables in the polynomial extension), the stores (i.e. the variables used in computation) can be modeled
as network nodes, each with a devoted read channel and a write channel. The defining property of such a node
is that it can store a value. The term stored in such a node determines its state. In this way, the usual notion
of state, as a partial assignment of values to variables, is included within the network model. The state of
a configuration is thus the product of the states of its actors, where some of the actors' only task is to store
some values.

Easy subterms. We assume that every data algebra comes equipped with the easy subterm relation $\sqsubseteq$.
The idea is that $s \sqsubseteq t$ implies that $s$ is a subterm of $t$ such that every principal who knows $t$ also
knows $s$. In other words, the views $\Gamma_A$ are lower closed under $\sqsubseteq$, as explained in [55]. This is in contrast
with hard subterms, which cannot be extracted: e.g., the plaintext $m$ and the key $k$ are hard subterms of
the encryption $E.k.m$. In the Dolev-Yao algebra, it is straightforward to define the easy subterm relation
inductively. For general algebraic theories, the task of discerning the subterms gets complicated. A general
treatment was attempted in [56].

3.3. Formalizing events and processes 

In this section we define processes, the events that processes engage in, and the ordering of events within a 
process. 
3.3.1. Events

An event or action is generally written in the form $a[t]$ where

• $a$ is the event identifier,

• $t$ is the term on which the event may depend.

When an event does not depend on data, the term $t$ is taken to be a fixed constant $t = /$, and we often
abbreviate $a[/]$ to $a$.

The most important events for our analyses are the action-coaction couples send-receive, and sample-emit,
for which we introduce special notations:

• send $\langle \cdot\, t\, \cdot \rangle$, receive $(\cdot\, t\, \cdot)$,

• emit $\langle {:}\, t\, {:} \rangle$, sample $({:}\, t\, {:})$.

Generically, we write

• $\langle t \rangle$ for a write action, which can be either $\langle \cdot\, t\, \cdot \rangle$ or $\langle {:}\, t\, {:} \rangle$, and

• $( t )$ for a read action, which can be either $(\cdot\, t\, \cdot)$ or $({:}\, t\, {:})$.

Another often used action is

• generate a random value $\nu[x]$.

It could also be implemented as sampling a source of randomness represented as a devoted node.

In addition, the nodes are capable of performing various local operations. Most are able to execute the
standard pseudo-code commands, like comparisons $(t = s)$ or assignments $(t := s)$. But the differences in
their computational resources will be essential in some of the security analyses of the procedures below.
Further examples of application specific events and actions will be introduced below.

For actions, such as $\langle \cdot\, t\, \cdot \rangle_P$ and $({:}\, t\, {:})_P$, the configuration $P$ must be controlled, i.e. the partial function
$© : \mathcal{P} \rightharpoonup \mathcal{J}$ must have a definite value $©P$.



Representing events as terms. How do we represent principals' observations of events and of other
principals' actions? The location of an event may be viewed as a source for sampling; the location of an
action must be controlled by a principal, who may be viewed as the sender of the message about the action
that took place. But since only data can be sent as messages, or sampled from sources, each observable
action $a[t]_P$ must be represented as a term $\ulcorner a[t]_P \urcorner$. In general, this is done by adding to the presentation of
the algebra $\mathcal{T}$ a mapping

$$\ulcorner - \urcorner : \mathbb{E} \to \mathcal{T}$$

which generates a representation of each event from $\mathbb{E}$. A similar map assigning to each action a logical
formula leads to dynamic logic [HI]. We'll see a typical example of a message about an action in Sec. 3.5.3,
where we return to the device handshake procedure. Bob can only conclude that the secret shared by his
and Alice's device is authentic if he sees Alice shaking the devices.

Self-sampling. A less typical, but not less interesting example arises if we assume that a configuration has
a channel to sample its own events. Such a configuration can then sample its own sampling, i.e. execute
the action $({:}\, \ulcorner ({:}\,{:}) \urcorner\, {:})$. The principal who controls such a configuration can then observe some of her own
observations. It is interesting to explore the authenticity of such observations. We shall touch on this again
later.

3.3.2. Processes 

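Definition 3.2. A process is a partially ordered multiset of localized events, i.e. a mapping

$$\mathcal{F} = \langle \mathcal{F}_{\mathbb{E}}, \mathcal{F}_{\mathcal{P}} \rangle : \mathbb{F} \to \mathbb{E} \times \mathcal{P}$$

where

• $(\mathbb{F}, \to)$ is a well-founded partial order, representing the structure of time,

• $\mathbb{E}$ is a family of events, and

• $(\mathcal{P}, \subseteq)$ is the partial order of configurations,

and they satisfy the requirements that

(a) if $\mathcal{F}_{\mathbb{E}}\phi$ is an action, then $©(\mathcal{F}_{\mathcal{P}}\phi)$ is well defined, and

(b) if $\phi \to \psi$ in $\mathbb{F}$ then $\mathcal{F}_{\mathcal{P}}\phi \subseteq \mathcal{F}_{\mathcal{P}}\psi$ or $\mathcal{F}_{\mathcal{P}}\phi \supseteq \mathcal{F}_{\mathcal{P}}\psi$ in $\mathcal{P}$.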

Notation: The points in time are denoted by events. By abuse of notation, we usually write $a[t]_P$
for $\phi \in \mathcal{F}$ where $\mathcal{F}_{\mathbb{E}}\phi = a[t]$ and $\mathcal{F}_{\mathcal{P}}\phi = P$. Of course, if there are several points in time $\phi_1, \phi_2, \ldots \in \mathcal{F}$
where the same $P$ executes the same $a[t]$, then this notation is ambiguous, since it is not clear to which $\phi_i$
$a[t]_P$ refers. But such situations are rare. On the other hand, with this notation the above conditions
become:

(a) if an action takes place at a configuration $P$, then $P$ is controlled, i.e. $©P$ must be well defined, and

(b) if $a[t]_P \to b[s]_Q$ then $P \subseteq Q$ or $P \supseteq Q$.

Remarks. The subset ordering of $(\mathcal{P}, \subseteq)$ arises from Def. 2.1, which says that configurations are finite
sets. Partially ordered multisets, or pomsets, were introduced and extensively studied by Vaughan Pratt and
his students [62]. Condition (a) specifies what we already said informally: that the configuration where an
action takes place must be controlled. Condition (b), on the other hand, means that there is no subliminal
synchronization: the ordering of events can only be imposed within a configuration that enables all of them. If
Alice performs one action controlling a configuration $P_A$, and then another action controlling a configuration
$Q_A$, then she must control a configuration $R_A \supseteq P_A \cup Q_A$, that will allow her to control the order of the
actions with $P_A$ and with $Q_A$. The intended use of configurations, including any constraints that would
require that some parts of $P_A$ and $Q_A$ should not be used together, must be imposed through axioms, in the
PDL language introduced in Sec. 4.
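
Conditions (a) and (b) are easy to check mechanically on a finite process. The following Python sketch does so under stated assumptions: configurations are flattened to frozensets of node names so that the subset order can be tested directly, the controlled events are taken to be the send and sample actions, and the names `ACTIONS` and `check_process` are hypothetical.

```python
ACTIONS = {"send", "sample"}   # the events that a principal controls


def check_process(events, order, control):
    """Return the violations of conditions (a) and (b).

    events:  dict mapping an event id to (kind, place)
    order:   set of (earlier id, later id) pairs
    control: dict mapping a place to the principal who controls it
    """
    violations = []
    for phi, (kind, place) in events.items():
        if kind in ACTIONS and place not in control:
            violations.append(("a", phi))            # uncontrolled action
    for phi, psi in order:
        p, q = events[phi][1], events[psi][1]
        if not (p <= q or p >= q):
            violations.append(("b", phi, psi))       # incomparable places
    return violations


P_A = frozenset({"D_A"})
Q_A = frozenset({"I_A", "D_A", "D_B"})
control = {P_A: "A", Q_A: "A"}

ok_events = {1: ("sample", Q_A), 2: ("send", P_A)}
bad_events = {2: ("send", P_A), 3: ("send", frozenset({"I_B"}))}

print(check_process(ok_events, {(1, 2)}, control))
print(check_process(bad_events, {(2, 3)}, control))
```

The second process fails both conditions: event 3 is an action at an uncontrolled place, and it is ordered after an event at an incomparable configuration.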

Definition 3.3. We say that the term $t$ originates at the point $\phi \in \mathcal{F}$ if $\phi$ is the earliest write of a term
containing $t$. Formally, $\phi$ thus satisfies

• $\mathcal{F}_{\mathbb{E}}\phi = \langle s \rangle$ where $t \sqsubseteq s$, and

• $\mathcal{F}_{\mathbb{E}}\xi = \langle s \rangle \wedge t \sqsubseteq s \implies \phi \to \xi$ holds for all events $\xi$.



Notation: Origination. We extend the notational conventions described above by denoting by $\langle\langle t \rangle\rangle_P$
the event $\phi$ where the term $t$ originates. The configuration $P$ is the originator of $t$.



3.4. Formalizing flows, runs and procedures 

We now extend our discussion to the definition of communication between processes, and extend our ordering 
to events occurring within a procedure as well as individual processes. 

We begin by defining a more general version of channel between two configurations, called a flow channel.
A flow channel exists between any two configurations if a channel exists between any two nodes on the
configuration trees. It is called a flow channel because the information passed along the channel flows upwards
to the configuration as a whole. It is defined formally below.

Definition 3.4. For configurations $P, Q \in \mathcal{P}$, a flow channel from $P$ to $Q$ can be either

• a channel $P \to Q$, or

• a flow channel from $P$ to $Q'$, where $Q' \in Q$, or

• a flow channel from $P'$ to $Q$, where $P' \in P$, or

• a flow channel from $P'$ to $Q'$, where $P' \in P$ and $Q' \in Q$.

A flow $a[t]_P \xrightarrow{\tau} b[s]_Q$ is given by

• a flow channel from $P$ to $Q$, and

• an interaction pair $a[t], b[s]$, i.e. a pair where

  - either $a[t] = \langle\cdot\, t \,\cdot\rangle$ and $b[s] = (\cdot\, s \,\cdot)$,

  - or $a[t] = \langle:\, t \,:\rangle$ and $b[s] = (:\, s \,:)$.

A flow $a[t]_P \xrightarrow{\tau} b[s]_Q$ is complete if $s = t$.
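The recursion in Definition 3.4 can be sketched in Python as follows. The encoding is assumed for illustration only: a configuration is either a node name (a string) or a tuple of subconfigurations, `channels` is a set of `(source, target)` pairs, and channel types are left out. The node names and the two channels in the example are hypothetical.

```python
def members(config):
    """Immediate subconfigurations of a configuration (none for a single node)."""
    return config if isinstance(config, tuple) else ()


def flow_channel(p, q, channels):
    """True if a flow channel leads from configuration p to configuration q."""
    if (p, q) in channels:
        return True  # a channel p -> q
    if any(flow_channel(p, q1, channels) for q1 in members(q)):
        return True  # a flow channel p -> q' with q' in q
    if any(flow_channel(p1, q, channels) for p1 in members(p)):
        return True  # a flow channel p' -> q with p' in p
    # The fourth clause (p' -> q') is reached by combining the two
    # recursive cases above, so it needs no separate branch.
    return False


Q = ("S_A", "R")                          # card inserted into the reader
channels = {("I_A", "R"), ("R", "I_A")}   # keyboard and display (assumed)
print(flow_channel("I_A", Q, channels))   # True
print(flow_channel(Q, "C_B", channels))   # False
```

Each recursive call strictly shrinks one of the two configurations, so the search terminates on finite configuration trees.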

Definition 3.5. Let $\mathcal{F}$ be a process. A run, or execution, of $\mathcal{F}$ is an assignment to each coaction $b[s]_Q$
of a unique flow $a[t]_P \xrightarrow{\tau} b[s]_Q$, which is required to be sound, in the sense that $b[s]_Q \not\to a[t]_P$ in $\mathcal{F}$.

A run is complete if all of the flows that it assigns are complete: the terms that are received are just those 
that were sent, and the inspections find just those terms that were submitted. 
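The two conditions of Definition 3.5 can be sketched in Python. The representation is an assumption for illustration: a run is a dictionary from read points to the write points assigned to them (so each coaction gets exactly one flow), `before` holds the `(earlier, later)` pairs of the process ordering and is assumed transitively closed, and `term_at` gives the term at each point. All point names are hypothetical.

```python
def is_sound(run, before):
    """Sound: no read already precedes the write it is assigned to."""
    return all((read, write) not in before for read, write in run.items())


def is_complete(run, term_at):
    """Complete: every read obtains exactly the term that was written."""
    return all(term_at[read] == term_at[write] for read, write in run.items())


# Alice writes at a1 and reads at a2; Bob reads at b1 and writes at b2.
before = {("a1", "a2"), ("b1", "b2")}
term_at = {"a1": "x", "b1": "x", "b2": ("r", "x"), "a2": ("r", "x")}

run = {"b1": "a1", "a2": "b2"}
print(is_sound(run, before), is_complete(run, term_at))  # True True

bad_run = {"b1": "b2"}  # Bob's read would depend on his own later write
print(is_sound(bad_run, before))  # False
```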

A run is a pomset extending its process. Setting $a[t]_P \to b[s]_Q$ whenever there is a flow $a[t]_P \xrightarrow{\tau} b[s]_Q$
of some type $\tau$ makes a run into an extension of the ordering of the process as a partially ordered
multiset. The pomset does not have to satisfy condition (b) of Def. 3.2 any more. Indeed, the whole point
of running a process is to extend, in a run $\mathcal{E}$, the internal synchronizations, given by the ordering of $\mathcal{F}$, with the
additional external synchronizations.

Overloading arrows. The view of the runs as order extensions of the processes justifies the overloading of 
the arrow notation, which is used both 

• as $a[t]_P \to b[s]_Q$, saying that $a[t]_P$ precedes $b[s]_Q$ in the partial ordering $(\mathcal{F}, \to)$, and

• as $a[t]_P \xrightarrow{\tau} b[s]_Q$, denoting a flow of type $\tau$ from $a[t]_P$ to $b[s]_Q$ in a run.

This overloading is consistent, because a flow from $a[t]_P$ to $b[s]_Q$ implies that $a[t]_P$ precedes $b[s]_Q$; and it will
be useful when we pass from the runs, e.g. in Figures 6-8, to formal reasoning about them in Figures 9-12. 
The arrows in the latter family of diagrams arise from the arrows in the former family. But the former 
represent reality, whereas the latter represent assertions about it. 

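Definition 3.6. A network procedure $\mathcal{L}$ is a pair $\mathcal{L} = (\mathcal{F}_{\mathcal{L}}, E_{\mathcal{L}})$ where

• $\mathcal{F}_{\mathcal{L}}$ is a process, and

• $E_{\mathcal{L}} = \{\mathcal{E}_1^{\mathcal{L}}, \mathcal{E}_2^{\mathcal{L}}, \mathcal{E}_3^{\mathcal{L}}, \ldots\}$ is a set of runs of $\mathcal{F}_{\mathcal{L}}$.

The elements of $E_{\mathcal{L}}$ are called secure runs. All other runs are insecure. A procedure is said to be secure if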
every insecure run can be detected by a given logical derivation from the observations of a specified set of 
participants. 

Procedures generalize protocols. A protocol is a special case of a network procedure, where the un- 
derlying network is a cyber network. Since cyber channels offer no security guarantees, the security goals 
of protocols are generally realized by cryptographic functions computed at the nodes. That is why cyber 
security is largely concerned with cryptographic protocols. It is thus based on the end-to-end paradigm, 
where the security tasks are pushed to the local computations at the smart "ends", i.e. nodes, leaving the 
network simple and efficient. The above definition of a secure procedure generalizes the definition of a secure 
protocol used in Protocol Derivation Logic [48, 56, 57], as well as in Protocol Composition Logic.
Network procedures and their security proofs thus extend cryptographic protocols, and their security proofs. 
What is the difference? First of all, any node in a cyber network is as good as any other node, so it does 
not matter which ones you control, or how many. Without loss of generality, configurations can thus be 
reduced to single nodes, and the channel flows to the messages on cyber channels. A protocol run thus boils
down to a sequence of messages among the principals, each usually controlling a single node, with some local 
computations in-between the messages. 

Nontrivial configurations arise in pervasive networks. E.g., a smart card can only compute when inserted 
into a correct configuration with a card reader. Moreover, the information can flow through a pervasive 
network in many different ways: by messages sent along a variety of different channels, short range, cellular, 
social, etc.; or by observations along visual channels, etc. The distinction between computation and commu- 
nication in pervasive networks becomes blurred, as the two become intertwined in subtle and complicated 
ways. 

Fig. 6. Challenge-Response (CR) protocol template

Graphic presentations of procedures. To specify a procedure we draw a picture of the pomset
$\mathcal{F} = \mathcal{F}_{\mathcal{L}}$, and then each of its extensions $\mathcal{E} = \mathcal{E}_i^{\mathcal{L}}$. Because of condition (b) of Def. 3.2, the events comparable
within the ordering of a process $\mathcal{F}$ must happen within a maximal configuration. Therefore, if the diagram
of the partially ordered multiset $\mathcal{F}$ is drawn together with the underlying network, then each component
of the comparable events can be depicted under the corresponding configuration. We can thus draw the
network above the process, and place the events occurring at each configuration along the imaginary vertical
lines flowing, say, downwards from it, like in Fig. 7. The additional ordering, imposed when in a run $\mathcal{E}$ the
messages get sent and the facts get observed, usually runs across, from configuration to configuration. This
ordering can thus be drawn along the imaginary horizontal lines between the events, or parallel with the
channels of the network. Such message flows can also be seen in Fig. 7. The dashed lines represent the data
sharing within a configuration.

This discipline of drawing

• the internal ordering of events along the verticals and

• the external ordering, imposed by the flows, along the horizontal lines

is, of course, familiar from strand spaces, where the verticals are the strands, and the horizontals the bundles
[29]. Our diagrams indeed boil down to strand diagrams whenever the network configurations are single nodes
connected by cyber channels, and when the only flows are messages. This graphic convention for depicting
the internal and external ordering of events goes back to the early days of distributed computing research,
see e.g. [38, 44].

3.5. Examples of procedures 

3.5.1. Challenge response authentication protocols 

We begin with a familiar special case of a procedure: a protocol. A large family of challenge-response authentication
protocols is subsumed under the template depicted on Fig. 6. Bob wants to make sure that Alice is online.
It is assumed that Alice and Bob share some sort of a secret $s^{AB}$, which allows them to define functions $c^{AB}$
and $r^{AB}$ such that

• $r^{AB}x$ can be computed from $c^{AB}x$ using $s^{AB}$, but

• $r^{AB}x$ cannot be computed from $c^{AB}x$ alone, without $s^{AB}$.

So Bob generates a fresh value $x$, sends the challenge $c^{AB}x$, and if he receives the response $r^{AB}x$ back, he
knows that Alice must have been online, because she must have originated the response. The idea behind
this template has been discussed, e.g., in [48, 56, 57]. The template instantiates the concrete protocol
components by refining the abstract functions $c^{AB}$ and $r^{AB}$ to concrete implementations, which satisfy the
above requirements: e.g., $c^{AB}$ may be the encryption by Alice's public key, and $r^{AB}$ may be the encryption
by Bob's public key, perhaps with Alice's identity.

Recall from Sec. 2.4.1 that cyber networks are degenerate, in the sense that the actors boil down to
the principals. Alice's and Bob's unique actors are thus simply denoted by $A$ and $B$.
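One way to instantiate the template with a shared secret is sketched below in Python. The choice of $c^{AB}$ as the identity and $r^{AB}$ as an HMAC-SHA256 tag keyed by the shared secret is an assumption made for illustration; the example in the text uses public-key encryption instead. The function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def challenge(x: bytes) -> bytes:
    """c^AB: here simply the fresh value itself (an assumed instantiation)."""
    return x


def response(secret: bytes, c: bytes) -> bytes:
    """r^AB: computable from the challenge only with the shared secret."""
    return hmac.new(secret, c, hashlib.sha256).digest()


def bob_accepts(secret: bytes, x: bytes, r: bytes) -> bool:
    """Bob recomputes the expected response and compares in constant time."""
    return hmac.compare_digest(response(secret, challenge(x)), r)


s_ab = secrets.token_bytes(32)    # secret shared by Alice and Bob
x = secrets.token_bytes(16)       # Bob's fresh value
r = response(s_ab, challenge(x))  # Alice's reply
print(bob_accepts(s_ab, x, r))    # True
```

Because both parties hold the same secret in this sketch, Bob could have computed the response himself, so his conclusion also rests on knowing that he did not.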

3.5.2. Two-factor authentication procedure

Next we describe the first nontrivial procedure, over the actor-network described in Sec. 2.4.2. It can be
viewed as an extension of the simple challenge-response authentication. There, Bob authenticates Alice
using her knowledge of a secret $s^{AB}$, which they both know. Here Bob authenticates that Alice knows a
secret $p^A$ that Bob does not know, and that she has a security token $S_A$, in this case a smart card. The
secret and the smart card are the "two factors". This is the idea of the procedure standardized under the
name Chip Authentication Programme (CAP), analyzed in [23]. The desired run of the challenge-response
option of this procedure is depicted on Fig. 7.

We assume that, prior to the displayed run, Alice the customer identified herself to Bob the bank, and
requested to be authenticated. Bob's computer $C_B$ then extracts a secret $s^{AB}$ that he shares with Alice.
This time, though, the shared secret is too long for Alice's human $I_A$ to memorize, so it is stored in the
smart card $S_A$. Just like in the CR protocol above, Bob issues a challenge, such that the response can only be
formed using the secret. So Bob in fact authenticates the smart card $S_A$. He entrusts the smart card $S_A$ with
authenticating Alice's human $I_A$. This is done using the secret $p^A$ shared by $I_A$ and $S_A$. The secret is stored
in both nodes. To form the response to Bob's challenge, Alice forms the configuration $Q$ by inserting her card
$S_A$ into the reader $R$. The configuration $Q$ requests that $I_A$ enters the secret PIN (Personal Identification
Number) $p^A$ before it forms the response for Bob. There is no challenge from $Q$ to $I_A$, and thus no freshness
guarantees in this authentication: anyone who sees $I_A$'s response can replay it at any time. Indeed, the
human $I_A$ cannot be expected to perform computations to authenticate herself: most of us have trouble even
submitting just the static PIN. The solution is thus to have the card-reader configuration $Q$ compute the
response, which Alice relays to Bob. The old PIN authentication is left to just convince $Q$ that Alice's
human $I_A$ is there: $Q$ tests that $p^A$, sent through the keyboard channel from $I_A$ to the reader $R$, coincides with
$p^A$ stored in the card $S_A$, and then generates a keyed hash $H s^{AB} x$ using the shared secret $s^{AB}$ and the
challenge $x$. This hash is displayed for Alice on the card reader $R$ as the response $r$, which Alice then sends
to her computer $C_A$ by the keyboard channel, and further to $C_B$ by the cyber channel.
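The role of the card-reader configuration $Q$ in this run can be sketched in Python. Everything concrete here is an assumption for illustration: the class and function names are hypothetical, HMAC-SHA256 stands in for the keyed hash $H$, and the response is truncated to eight hex digits only so that a human could retype it. This is not the CAP specification.

```python
import hashlib
import hmac


class CardReaderConfig:
    """Configuration Q: smart card S_A inserted into reader R (names assumed)."""

    def __init__(self, shared_secret: bytes, stored_pin: str):
        self._secret = shared_secret  # s^AB, stored in the card
        self._pin = stored_pin        # p^A, stored in the card

    def respond(self, typed_pin: str, challenge: bytes):
        """Return the keyed hash of the challenge, or None if the PIN test fails."""
        if not hmac.compare_digest(typed_pin.encode(), self._pin.encode()):
            return None
        tag = hmac.new(self._secret, challenge, hashlib.sha256).hexdigest()
        return tag[:8]  # shortened for display (an assumption of this sketch)


def bank_verifies(shared_secret: bytes, challenge: bytes, typed_response: str) -> bool:
    """C_B recomputes the response from its own copy of s^AB."""
    expected = hmac.new(shared_secret, challenge, hashlib.sha256).hexdigest()[:8]
    return hmac.compare_digest(expected, typed_response)


q = CardReaderConfig(b"shared-secret-s_AB", "4921")
x = b"fresh-challenge"
r = q.respond("4921", x)                           # I_A types the PIN into R
print(bank_verifies(b"shared-secret-s_AB", x, r))  # True
print(q.respond("0000", x))                        # None
```

The PIN never leaves the configuration: only the keyed hash travels over the keyboard and cyber channels.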

This two-factor procedure is thus more secure than the simple password authentication of $I_A$ to $C_B$
because

• there is a fresh challenge, and the attacker cannot impersonate Alice just by recording one session (like
phishermen do);

• even in the option without the challenge, the secret $s^{AB}$, shared between two computing devices, is
generally stronger than a human memorable secret $p^A$, and finally

• the PIN authentication to the smart card is not cryptographically strong, but it is done on a physical
channel, which is harder to attack.

If a thief comes into the possession of the smart card $S_A$, he cannot use it without $p^A$, stored in $I_A$. This
leads to muggings, i.e. attacks where $S_A$ is stolen, and then $I_A$ is coerced to disclose the PIN $p^A$. The
authors of [23] point out that introducing the portable, generic readers $R$ simplifies for the attacker the
verification that the number given to him by $I_A$ is indeed the correct $p^A$.

3.5.3. Device handshake procedure 

Going back to Fig. 2, we describe the desired run of the Device Handshake. The run begins by Alice bringing
together the nodes for the configuration $Q_A$. This means that she takes her device $D_A$ and Bob's device $D_B$
into her hand $I_A$, to shake them together. We also assume that the configuration contains the node $S$, which
is the source of randomness. This node does not correspond to a physical object, but embodies the source
that is being sampled by shaking. In reality, the two devices (i.e. their accelerometers) record the joint action
of shaking together, and sample the shared secret out of it. The method to achieve this is described in [43],
and we take it as given. The desired run on the network from Fig. 2 is depicted on Fig. 8. It consists of five
flows, triggered by $Q_A$'s shaking of the devices:

• a flow from the source $S$ to $Q_A$, caused by $Q_A$'s sampling of $x$ from $S$;

• a flow from $I_A$ to $D_A$, simultaneously caused by $D_A$'s sampling of $I_A$'s fingerprint $f^A$;

• a flow from $Q_A$ to $I_B$, where $Q_A$ shows $I_B$ how she shakes the devices to sample $x$;

• a flow from $D_A$ to $I_B$, confirming to $I_B$ that $D_A$ received $x$ and the correct $f^A$; and

• a flow from $D_B$ to $I_B$, confirming to $I_B$ that $D_B$ received $x$.

Remark. Note that the third flow contains an example of a message about an action, along the lines
explained in Sec. 3.3, in the paragraph about representing events as terms. Here Bob's human $I_B$ receives
from Alice's configuration $Q_A$ the visual message $[(::)]$ that she has sampled the randomness.

Two-round device handshake. If both Alice and Bob need to be assured that a secret is shared, then the
last two messages, where $D_A$ and $D_B$ confirm that they succeeded in sampling $x$, and one of them confirms that
the fingerprint is correct, should be sent to both $I_A$ and $I_B$. Running two rounds of the device handshake,
with both Alice and Bob shaking their devices in the actor-network from Fig. 5, would require such a procedure
in each of the two rounds.



4. Procedure Derivation Logic 

4.1. Procedures are distributed predicates 

Formal methods for software engineering have been built upon Hoare's slogan that "Programs are predicates"
[33]. The view that computations can be adequately approximated by their logical descriptions, and proven
correct, was the guiding principle of theory and practice of software specifications from the outset in the
1960s [30]. The underlying assumption is that software is run within a single computer, with a global observer
asserting the predicates. In contrast, the essence of a network is that there is no global observer. The events 
at each node may be directly observed only by the principal who controls that node. In addition, they may 
be indirectly observed along an authentic channel that ends at that node. 

The upshot is that each participant of network computation may observe different events, and assert 
different predicates. Each of them sees just bits and pieces of the network process. In most cases of interest, 
their common goal is to coordinate their observations, and arrive at a coherent joint view of the events in 
which they jointly participate, so that they can all assert the same predicate, which is the required security 
property. Towards this goal, they use the channels between them to exchange messages, and to observe each 
other. 

In summary, our goal is thus to extend formal methods from programs as predicates to network procedures 
as families of predicates, asserted at different network nodes. The task of reconciling these local assertions 
is usually formalized as authentication. In this section, we gradually introduce the language and the axioms 
that allow participants of network computations to annotate their local observations by global predicates, 
building up from the basic communication axioms, towards authentication, and beyond. 

4.2. The language of PDL 

A statement of PDL is in the form $A : \Phi$, where $A$ is a principal, and $\Phi$ is a predicate asserted by $A$.
The predicate $\Phi$ is formed by applying logical connectives to the atomic predicates, which can be

• $a[t]_P$, meaning "the event $a[t]_P$ happened"; or

• $a[t]_P \to b[s]_Q$, meaning "the event $a[t]_P$ happened before $b[s]_Q$".

Notation: Statements assert events, and events describe points of time. Recall that the expressions
like $a[t]_P$ refer to points in time descriptively, i.e. by specifying what happened and where. As explained in
Sec. 3.3, this is a notational abuse, but an important one. The descriptions $a[t]_P$ are sometimes refined to
$\surd a[[t]]_P$, which say that the event $a[s]$ took place at the configuration $P$, for some $s \sqsupseteq t$, and that the term
$t$ originates there.



The essence of PDL. Here, the event descriptions like $a[t]_P$ and $\surd a[[t]]_P$ moreover denote the assertions
that the described event took place. The descriptions of processes and of their runs are used as the predicates
to annotate them, and to reason about them. This notational abuse is justified by the isomorphism of
process executions and their logical annotations. This is the basic design decision on which PDL is based:
The descriptions of processes and of their runs as partial orders are used as the logical assertions to annotate
these processes and their runs. This is the formal implementation of the guiding principle of PDL, explained
earlier in the paper.

This isomorphism does not only simplify notation, but substantially simplifies the logic that we work
with. A statement in any protocol or procedure logic is an assertion of a participant, made at a certain
point of a run. In PCL, this dependency on the run was expressed using dynamic modalities. In PDL, though,
the process expression within a modality is isomorphic to a logical formula. So instead of

• $A : [\psi]\Phi$, saying that $A$ knows that $\Phi$ is valid after the execution point $\psi$ is reached, we can write

• $A : \Psi \Longrightarrow \Phi$, saying that $A$ knows that $\Phi$ is valid whenever the description $\Psi$ of $\psi$ is valid.

The two formulas are semantically equivalent because the formula $\Psi$ is a complete description of the process
$\psi$. The PDL assertions thus usually appear in the form $A : \Psi \Longrightarrow \Phi$, where $\Psi$ describes $A$'s view of the run,
and $\Phi$ her conclusion about it. The examples follow.



4.3. Communication axioms 

The statements of PDL describe the events that happen in a run of a process, and their order. The basic PDL 
statements are its axioms, which we describe next. They are taken to be valid in all runs of all processes. 
The other valid statements are derived from them. 

4.3.1. Origination 

The origination axioms say that any message that is received must have been sent, and that any source that
is sampled must have been emitted to. This has been explained earlier. More precisely, any principal
that controls a configuration $P$ where a message is received knows that it must have been sent by someone,
no later than it was received; and similarly for a source that is sampled. Formally

(orig.m)

(orig.s)

4.3.2. Freshness

In Sec. 3.2, we explained the idea of modeling random values as the indeterminates in polynomial algebras of
messages. The freshness axiom extends this idea to processes, by requiring that each indeterminate $x$ must
be

• freshly generated by an action $\nu[x]$ before it is used anywhere; and

• that it can only be used elsewhere after it has passed in a message or a source,

which formally becomes

(fresh.1)

(fresh.2)

where, using the easy subterm order $\sqsubseteq$ from Sec. 3.2,

• $((\cdot\, x \,\cdot))_X$ abbreviates $\exists t.\ x \sqsubseteq t \wedge (\cdot\, t \,\cdot)_X$,

• $\langle\langle\cdot\, x \,\cdot\rangle\rangle_X$ abbreviates $\exists t.\ x \sqsubseteq t \wedge \langle\cdot\, t \,\cdot\rangle_X$, etc.
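The four axioms named in Sec. 4.3.1 and 4.3.2 can be sketched from the accompanying prose and from the way (orig.m) and (fresh.2) are applied in the proofs of Sec. 5. The LaTeX below is such a sketch and should be read as one plausible rendering, not as the authors' typeset formulas: the existential quantification over $X$, the side condition $X \neq Y$, and the restriction of (fresh.2) to messages rather than sources are all assumptions.

```latex
% Hedged reconstruction from the prose; quantifiers and side conditions are assumptions.
% Requires amsmath (for gather*, \tag and \text).
\begin{gather*}
\text{\textcopyright}P : (\cdot\, t \,\cdot)_P \Longrightarrow
  \exists X.\ \langle\cdot\, t \,\cdot\rangle_X \to (\cdot\, t \,\cdot)_P
  \tag{orig.m}\\
\text{\textcopyright}P : (:\, t \,:)_P \Longrightarrow
  \exists X.\ \langle:\, t \,:\rangle_X \to (:\, t \,:)_P
  \tag{orig.s}\\
a[t]_Y \wedge x \sqsubseteq t \Longrightarrow
  \exists X.\ \nu[x]_X \to a[t]_Y
  \tag{fresh.1}\\
\nu[x]_X \wedge a[t]_Y \wedge x \sqsubseteq t \wedge X \neq Y \Longrightarrow
  \nu[x]_X \to \langle\langle\cdot\, x \,\cdot\rangle\rangle_X
  \to ((\cdot\, x \,\cdot))_Y \to a[t]_Y
  \tag{fresh.2}
\end{gather*}
```

Read this way, (orig.m) yields the unknown sender $X$ of a received message, and (fresh.2) places the receipt of the fresh value at $X$ between its sending and any later use at $X$.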

4.4. Authentication axioms

In classical logic, a statement may be true or false. In classical formal methods, an assertion about a com- 
putation is also either true or false, and it is assumed that we can observe which one it is. The idea is that 
we can, e.g. inspect the program variables in the debug mode. In network computation, an assertion is still 
either true or false — yet none of the participants may be able to observe whether it is true or false. E.g., 
when Alice receives a message on a cyber channel, she may not be able to verify whether the statement "This 
message is from Bob" is true or false. The process of verifying such non-local statements by local means is 
authentication. In our model, there are two forms of authentication: 

• interactions along authentic channels, and 

• challenge-response authentication. 

4.4.1. Interactions along authentic channels

An authentic channel allows at least one of the participants to observe not only the events on their own 
end of the channel, but also on the other end. So there are four types of authentic channels, supporting the 
following assertions: 

©P : $\langle\cdot\, t \,\cdot\rangle_P \to (\cdot\, t \,\cdot)_Q$ (auch.m.1)        ©P : $\langle:\, t \,:\rangle_P \to (:\, t \,:)_Q$ (auch.p.1)

©Q : $\langle\cdot\, t \,\cdot\rangle_P \to (\cdot\, t \,\cdot)_Q$ (auch.m.2)        ©Q : $\langle:\, t \,:\rangle_P \to (:\, t \,:)_Q$ (auch.p.2)

Channels that satisfy (auch.m.1) or (auch.p.1) are called write-authentic; channels that satisfy (auch.m.2) or
(auch.p.2) are called read-authentic. Here are some examples from each family:

• A keyboard channel guarantees to the sender that the device at which she is typing is receiving the
message, and thus satisfies (auch.m.1).

• A visual channel used for sending a message allows the receiver to see the sender, and satisfies (auch.m.2).

• When my fingerprints are taken, I observe that they are taken, and can see who is taking them, so this
biometric channel satisfies (auch.p.1).

• Moreover, the person taking my fingerprints also observes that they are taking my fingerprints, so 
(auch.p.2) is also satisfied. 

• If a visual channel is used for surveying, then the surveyor sees where the display appears, and thus 
satisfies (auch.p.2) as well; etc. 
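The examples above can be collected into a small lookup table. The Python names below are hypothetical labels for the channel kinds in the list; only the axiom names and the two definitions of write- and read-authenticity come from the text.

```python
# Which authenticity axioms each example channel is said to satisfy.
AUTHENTIC = {
    "keyboard": {"auch.m.1"},
    "visual (message)": {"auch.m.2"},
    "fingerprint": {"auch.p.1", "auch.p.2"},
    "visual (surveying)": {"auch.p.2"},
}


def write_authentic(channel: str) -> bool:
    """Satisfies auch.m.1 or auch.p.1: asserted by the owner of the writing end."""
    return bool(AUTHENTIC[channel] & {"auch.m.1", "auch.p.1"})


def read_authentic(channel: str) -> bool:
    """Satisfies auch.m.2 or auch.p.2: asserted by the owner of the reading end."""
    return bool(AUTHENTIC[channel] & {"auch.m.2", "auch.p.2"})


for name in AUTHENTIC:
    print(name, write_authentic(name), read_authentic(name))
```

Only the fingerprint channel is both write- and read-authentic among these examples.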

Besides these assertions about the order of events, some authentic channels support other assertions. 
They are usually application specific, and we impose them as procedure specific axioms. 

Authenticity of self-sampling. One particular authenticity axiom worth mentioning is the statement 
that the self-observation channel is authentic, at least for sampling the sampling actions, i.e. 

©P : $(:\, [(::)_P] \,:)_P \Longrightarrow (::)_P$ (cog)

In other words, "If I observe that I have observed something, then I have really observed something" . This 
is the PDL version of Descartes' authentication of the world: "Cogito, ergo sum". 

4.4.2. Challenge-response authentication

The challenge-response axiom is in the form 

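©P : $\mathit{Local}_P \Longrightarrow \mathit{Global}_{PQ}$ (cr)

where, using the notation from Sec. 4.3.2,

$\mathit{Local}_P = \nu[x]_P \to \langle\cdot\, c^{PQ}x \,\cdot\rangle_P \to (\cdot\, r^{PQ}x \,\cdot)_P$

$\mathit{Global}_{PQ} = \nu[x]_P \to \langle\cdot\, c^{PQ}x \,\cdot\rangle_P \to ((\cdot\, c^{PQ}x \,\cdot))_Q \to \langle\langle\cdot\, r^{PQ}x \,\cdot\rangle\rangle_Q \to (\cdot\, r^{PQ}x \,\cdot)_P$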

Translated into words, (cr) says that the owner ©P of the configuration P knows that 

Fig. 9. The graphic view of (cr) axiom

Fig. 10. Challenge-response using signatures



• if he generates a fresh $x$, sends the challenge $c^{PQ}x$, and receives the response $r^{PQ}x$,

• then $Q$ must have received a message containing $c^{PQ}x$ after he sent it, and then she must have sent a
message containing $r^{PQ}x$ before he received it.

Using (cr), from certain observations of the local events at $P$, the principal ©P can thus draw the conclusions
about certain non-local events at $Q$, which he cannot directly observe. Fig. 9 depicts this reasoning
diagrammatically. This axiom should be viewed as an assertion about the functions $c^{PQ}$ and $r^{PQ}$. They
must be such that $Q$ can compute $r^{PQ}x$ from $c^{PQ}x$, but no one else can do it.¹
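The shape of the (cr) axiom can be illustrated by a small Python check over one recorded run. The encoding is an assumption: a run is a list of `(action, term, configuration)` triples listed in one total order compatible with the run, with hypothetical action tags "new", "send" and "recv", and terms as atoms or nested tuples. The function only tests whether the five-step non-local chain is present in that listing; it does not derive it, which is what the axiom licenses ©P to do.

```python
def contains(t, s):
    """True if term t occurs inside term s (atoms or nested tuples)."""
    return t == s or (isinstance(s, tuple) and any(contains(t, u) for u in s))


def matches(pattern, event):
    """Compare one pattern (action, term, config, exact) with one event."""
    action, term, config, exact = pattern
    a, t, c = event
    if (a, c) != (action, config):
        return False
    return t == term if exact else contains(term, t)


def has_chain(run, chain):
    """True if the patterns in `chain` occur in `run` in the given order."""
    remaining = iter(run)
    return all(any(matches(p, e) for e in remaining) for p in chain)


def global_pq(x, c, r, P, Q):
    """The chain on the right-hand side of (cr), in this assumed encoding."""
    return [
        ("new", x, P, True),       # P generates the fresh x
        ("send", c(x), P, True),   # P sends the challenge c x
        ("recv", c(x), Q, False),  # Q receives some message containing c x
        ("send", r(x), Q, False),  # Q sends some message containing r x
        ("recv", r(x), P, True),   # P receives the response r x
    ]


c = lambda x: ("c", x)
r = lambda x: ("r", x)
run = [
    ("new", "x", "P"),
    ("send", ("c", "x"), "P"),
    ("recv", (("c", "x"), "hdr"), "Q"),
    ("send", (("r", "x"), "hdr"), "Q"),
    ("recv", ("r", "x"), "P"),
]
chain = global_pq("x", c, r, "P", "Q")
print(has_chain(run, chain))               # True
print(has_chain(run[:2] + run[4:], chain)) # False
```

The second call keeps only the three local events at `P`; the check fails there because the listing no longer contains the events at `Q` that the axiom lets ©P infer.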



Remark. The (cr) axiom, and the corresponding protocol template, displayed on Fig. 9, has been one of 
the crucial tools of the Protocol Derivation Logic, all the way since [IHHH], through to [57] . 



5. Examples of reasoning in PDL 

5.1. On the diagrammatic method 

In its diagrammatic form depicted on Fig. 9, axiom (cr) says that the verifier P, observing the local path on 
the left, can derive the path around the non-local actions on the right. This pattern of reasoning resembles 
the categorical practice of diagram chasing [42, 54]. Categorical diagrams are succinct encodings of lengthy
sequences of equations. Just like the two sides of the implication in (cr) correspond to two paths around 
Fig. 9, the two sides of an equation are represented in a categorical diagram as two paths around a face of that 
diagram. The components of the terms in the equations correspond to the individual arrows in the paths. 
The equations can be formally reconstructed from the diagrams. Moreover, the diagrams can be formally 
combined into new proofs. The algebraic structures are thus formally transformed into geometric patterns. 
After some practice, the geometric intuitions begin to guide algebraic constructions in the formal language 
of diagrams. We apply a similar strategy to PDL. 

5.2. Cryptographic (single-factor) authentication 

We begin with a very simple example of diagrammatic reasoning, present already in [48] . 



¹ In the cases when $c^{PQ}$ and $r^{PQ}$ are based on a secret shared between $P$ and $Q$, $P$ can compute $r^{PQ}$ as well. In such
cases, the soundness of (cr) depends on ©P's observation that $P$ has not done that.

Theorem. The functions

$c^{PQ}x = x$        $r^{PQ}x = S^Q x$

implement (cr), provided that the abstract signature function $S^Q$ satisfies the following axioms:

(a) $S^Q u = S^Q v \Longrightarrow u = v$, i.e., $S^Q$ is injective,

(b) $\surd\langle\langle S^Q t \rangle\rangle_X \Longrightarrow X = Q$, i.e., $S^Q t$ must originate from $Q$,

(c) $V^Q u\, t \iff u = S^Q t$, i.e., the predicate $V^Q$ is satisfied just for the pairs $u, t$ where $u = S^Q t$,

and that these axioms are known to the principal Bob = ©P.

Proof. To prove the claim, we chase the diagram on Fig. 10. The numbered arrows arise from the following
steps:

1. Bob = ©P observes $\nu[x]_P \to \langle\cdot\, x \,\cdot\rangle_P \to (\cdot\, r \mid V^Q r\, x \,\cdot)_P$, i.e. after sending a fresh value $x$, he receives a
response $r$ which passes the verification $V^Q r\, x$.

2. Using the axioms (c) and (orig.m), he concludes that there is some $X$ such that $\langle\cdot\, S^Q x \,\cdot\rangle_X \to (\cdot\, r \mid V^Q r\, x \,\cdot)_P$.

3. Using (fresh.2) he further derives that for the same $X$ holds $\langle\cdot\, x \,\cdot\rangle_P \to ((\cdot\, x \,\cdot))_X \to \langle\cdot\, S^Q x \,\cdot\rangle_X$.

4. Using (a) and (b), Bob concludes that $S^Q x$ must have originated from $Q$.

Bob can, of course, only be sure that $Q$ was online between his $\langle\cdot\, x \,\cdot\rangle_P$ and $(\cdot\, r \mid V^Q r\, x \,\cdot)_P$, and not that Alice
= ©Q really intended to respond to his challenge. It is well known that this form of authentication is open
to impersonation, since $r = S^Q x$ contains no reference to Bob or to $P$. □



5.3. Pervasive (two-factor) authentication 

Next we describe how Bob the bank authenticates Alice the customer in the CAP procedure. 

Theorem. The procedure on Fig. 7 implements authentication, i.e. satisfies (cr), provided that the following
assumptions are true, and known to Bob:

(a) $Hu = Hv \Longrightarrow u = v$, i.e., $H$ is injective;

(b) $\surd\langle\langle s^{AB} \rangle\rangle_X \Longrightarrow X = S_A \vee X = C_B$, i.e., $s^{AB}$ must originate from $S_A$ or $C_B$;

(c) $\surd\langle\langle p^A \rangle\rangle_X \Longrightarrow X = I_A \vee X = S_A$, i.e., $p^A$ must originate from $I_A$ or $S_A$;

(d) {■Hs'^^x-)^ =^ (^(•p^,a;-)Q ^ (-iJs^^a;-)^) Ap-^ , i.e., 5a and i? are honest. 

Proof. Prior to the displayed execution, Alice is assumed to have sent to Bob her identity, and a request 
to be authenticated. Following this request, Bob's computer Cb has extracted the secret s^^ from a store, 
which he will use to verify that Sa has generated the response. 

To prove the claim, we chase the diagram on Fig. 11. The enumerated steps in the diagram chase
correspond to the following steps in Bob's reasoning:

1. Bob observes $\nu[x]_{C_B} \to \langle\cdot\, x \,\cdot\rangle_{C_B} \to (\cdot\, H s^{AB} x \,\cdot)_{C_B}$.

2. Using (orig.m) he concludes that there is some $X$ such that $\langle\cdot\, H s^{AB} x \,\cdot\rangle_X \to (\cdot\, H s^{AB} x \,\cdot)_{C_B}$.

3. Using (fresh.2) he further derives that for the same $X$ holds $\langle\cdot\, x \,\cdot\rangle_{C_B} \to ((\cdot\, x \,\cdot))_X \to \langle\cdot\, H s^{AB} x \,\cdot\rangle_X$.

4. By (a) and (b), from the observation that he did not use $s^{AB}$, Bob concludes that $H s^{AB} x$ must have
originated in a configuration $Q$ containing $S_A$.

5. By (c), $\langle\langle\cdot\, p^A \,\cdot\rangle\rangle_{I_A} \to ((\cdot\, p^A \,\cdot))_Q \to (p^A = p^A) \to (H s^{AB} x)$, where the last action abbreviates $(r := H s^{AB} x)$,
and we write out $r$ as $H s^{AB} x$ in the rest of the diagram. (See Remark below.)

6. Since $Q$ had to also receive $x$ before computing the response, $(\cdot\, p^A, x \,\cdot)_Q \to (H s^{AB} x)$ follows by (d).
So $((\cdot\, p^A \,\cdot))_Q$ from 5 is $(\cdot\, p^A, x \,\cdot)_Q$.

7. By (orig.m), there is $Y$ with $\langle\cdot\, p^A, x \,\cdot\rangle_Y \to (\cdot\, p^A, x \,\cdot)_Q$. By (c), $\langle\langle\cdot\, p^A \,\cdot\rangle\rangle_{I_A}$ from 5 must be $\langle\cdot\, p^A, x \,\cdot\rangle_{I_A}$.

Fig. 11. B's reasoning in CAP



8. The fresh value $x$ has thus been sent to $Q$ by $I_A$. It follows that in 2 and 3 above must be $X = Q$.

9. Since $A$ controls $S_A$ and $I_A$, and $S_A \in Q$ generated the response $H s^{AB} x$, only $I_A$ could have sampled
$H s^{AB} x$ along the visual channel.

10. Since $A$ controls $I_A$ and $C_A$, only $I_A$ could have sent $H s^{AB} x$ to $C_A$ along the keyboard channel.

These logical steps suffice to assure Bob that if he observes the local flow on the right in Fig. 11, then the
non-local flow along the external boundary, all the way to the left side of the diagram and back, must have
taken place. Comparing this diagrammatic conclusion with the pattern of (cr) on Fig. 9, we see that Bob
has proven an instance of authentication.

More explicitly, Bob's conclusion in Fig. 11 is

where we also added the trivial relay flows omitted from the figure, namely

• $\langle\cdot\, x \,\cdot\rangle \to (\cdot\, x \,\cdot)$ on the way out, and

• $\langle\cdot\, H s^{AB} x \,\cdot\rangle \to (\cdot\, H s^{AB} x \,\cdot)$ on the way back.

Again, this conclusion is clearly an instance of (cr) and thus realizes authentication. □

Fig. 12. B's view of A's round of device handshake



Remarks. The reader may note that the store $r$, to which the response $H s^{AB} x$ is assigned at run time, was
present in Fig. 7, but elided in Fig. 11. This is a minor quirk here, but we wanted to adhere to the custom
established in the informal protocol analyses, and propagated, e.g. through the strand space notation: stores
and program variables are by convention avoided, and denoted by the terms assigned to them at runtime. 
Although such a term, strictly speaking, does not have a static value, and denoting the stores ready to receive 
it when it is evaluated by its unevaluated expression is abuse of notation — it seems to be an eminently 
reasonable abuse, since it displays the path of the term during the run, whereas the names of all the local 
stores ready to receive it, only conceal its path. So the diagrammatic convention wins the day. 

Honesty assumption. It would be easy and natural to eliminate the assumption that Sa is honest by 
storing also at Ia, and including it into the response. That would reduce the above reasoning to "Sa 
has been on the path of the message" and added a separate thread " Ia has also been on the path of the 
message" . We chose to present the above version as slightly more informative, albeit slightly weaker. 

Does Bob need to be authenticated? In practice, the attackers usually impersonate Bob, to steal Alice's
credential and use that to impersonate her to the actual Bob, the bank. The most frequent form of that
attack is, of course, phishing. Two-factor authentication is devised to avoid that: here Alice does not give
her credentials to Bob, but to the smart card reader. Why is that better? How does Alice know that reader
$R$'s request for PIN is authentic? She sees on the visual channel that the only card in $R$ is $S_A$, which she
had put there herself. This is a simple but typical example of use of an authentic channel.

5.4. Device pairing by handshake 

For the final example, we return to the device handshake procedure. For reasons of space, we omit the proof of
the theorem, but present the diagram that is used to supply the proof.

Theorem. Upon the completion of a run of the procedure described in Sec. 3.5.3, Bob can be sure that his
and Alice's device share a key, provided that he knows that the following assumptions are valid:

(a) $\langle\cdot\, [a_Q] \,\cdot\rangle_Q \to (\cdot\, [a_Q] \,\cdot)_P$, i.e. the visual channel satisfies (auch.m.2) at least for events;

(b) $Q_A$ distributes the same value to $D_A$ and $D_B$;

(c) $Q_A$ is honest;

(d) $D_A$ is honest; and

(e) $D_B$ is honest.



Bob's formal reasoning is displayed on Fig. 12.



6. Conclusion, discussion, and some last minute philosophy 

Summary. We have presented a logical framework for reasoning about security of protocols that make use
of a heterogeneous mixture of humans, devices, and channels. We have shown how different properties of 
channels and configurations can be expressed and reasoned about within this framework. A key feature of this 
framework is that it supports explicit reasoning about both the structure of a protocol and the contributions 
made by its various components, using a combination of diagrammatic and logical methods. Because of this, 
we believe that our approach can be particularly useful in giving a more rigorous foundation for white-board 
discussions, in which protocols are usually displayed graphically. By annotating the diagram with the proof, 
using the methods demonstrated in this paper, designers could bring formal reasoning to bear at the very earliest 
stages of the design process. We also believe that it should be possible to develop tool support, 
in which a proof engine would execute the logic and the proof itself would be displayed on a graphical 
template. Both avenues are questions for future research. 

Post hoc ergo propter hoc? Although we speak of complete runs, as specified in Def. 3.5, and the 
information flows seem completely displayed in such runs, it is always possible to raise the question whether 
a demonstrated temporal order of events reflects their causal order. Can it be that Bob has received the 
same message that Alice had sent, but that by some coincidence Carol had sent an identical message, and 
that Bob actually received Carol's message? If the two messages are really identical, how can we tell? 

Should we try to distinguish the causal connections from temporal coincidences? More specifically, 
would it be possible to refine PDL by working with statements that would specify not just the order 
of events, but also which interactions occurred through which channels? The idea would thus be to discern 
whether Bob has received Alice's or Carol's message by following the messages as they travel from channel 
to channel, and seeing which one goes where. 

We believe that this would just complicate the logic, without bringing essentially more information. Even if 
we could follow two messages with the same payload, say m1 and m2, on their respective paths across the 
network, could we be sure that their paths never cross? Networks are busy places, a hop from node to node 
may conceal many intermediary steps, and m1's hop a1 → b1 may pass through an invisible node x at the 
same time as m2's hop a2 → b2 passes through x. If that happens, then m1 may emerge at b2 and m2 at 
b1. Or the other way around. We will never know. It will remain uncertain which of the two messages has 
reached Bob in the end. 
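
The crossing argument can be illustrated with a small sketch. It is not part of PDL; the function name `observable_outcomes` and the way the invisible node is modeled (as an arbitrary assignment of messages to exits) are assumptions made only for this illustration. The sketch enumerates every way a hidden node could hand two messages on to the exits b1 and b2, and groups these hidden assignments by what the receivers can actually observe, namely the payloads.

```python
from itertools import permutations


def observable_outcomes(payloads, exits):
    """Group hidden routings by what the receivers can observe.

    payloads: dict mapping a (hypothetical) message id to its payload.
    exits:    list of exit nodes, one per message.
    Returns a dict: observation (tuple of (exit, payload) pairs)
    -> list of hidden assignments (exit -> message id) producing it.
    """
    outcomes = {}
    for perm in permutations(list(payloads)):
        hidden = dict(zip(exits, perm))  # which message left at which exit
        seen = tuple((e, payloads[hidden[e]]) for e in exits)
        outcomes.setdefault(seen, []).append(hidden)
    return outcomes


# Identical payloads: both routings collapse into a single observation.
same = observable_outcomes({"m1": "hello", "m2": "hello"}, ["b1", "b2"])

# Distinct payloads: each routing yields its own observation.
diff = observable_outcomes({"m1": "hello", "m2": "world"}, ["b1", "b2"])
```

With identical payloads, `same` contains one observation backed by two different hidden assignments, so no statement about what was observed at b1 and b2 can tell them apart. With distinct payloads, `diff` contains two observations, each backed by exactly one assignment.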

Logic has no business chasing the ghost of causality. The best we can do is describe the order of the 
events. Partially ordered multisets will remain a robust foundation for reasoning about network interactions 
and procedures for many applications to come. 
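
As a minimal sketch of what reasoning over a partially ordered multiset looks like, consider the scenario with Alice, Carol and Bob from above. The class `Pomset` and its methods are hypothetical names introduced here for illustration, not a data structure defined in this paper. Events carry labels, which may repeat, and the order is given by a set of pairs that is closed under transitivity.

```python
from itertools import product


class Pomset:
    """Illustrative pomset: labeled events with a strict partial order."""

    def __init__(self, labels, order):
        self.labels = dict(labels)   # event id -> label (labels may repeat)
        self.before = set(order)     # pairs (a, b) meaning a precedes b
        # Close the relation under transitivity.
        changed = True
        while changed:
            changed = False
            for (a, b), (c, d) in product(list(self.before), repeat=2):
                if b == c and (a, d) not in self.before:
                    self.before.add((a, d))
                    changed = True
        # A cycle shows up as an event preceding itself.
        if any(a == b for a, b in self.before):
            raise ValueError("order contains a cycle")

    def precedes(self, a, b):
        return (a, b) in self.before

    def concurrent(self, a, b):
        """True if neither event is ordered before the other."""
        return a != b and not self.precedes(a, b) and not self.precedes(b, a)


# Both sends precede Bob's receive; the two sends are unordered.
run = Pomset(
    {"sA": "Alice sends m", "sC": "Carol sends m", "rB": "Bob receives m"},
    [("sA", "rB"), ("sC", "rB")],
)
```

Here `run.precedes("sA", "rB")` and `run.precedes("sC", "rB")` both hold, while `run.concurrent("sA", "sC")` holds as well. The order records that both sends came before the receive, and nothing more: it makes no claim about which send caused it.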

Can we really model social interactions and computations within the same model? People are 
not computers and society is not a computer network. How could I ever predict the behavior of a social 
group, when I am unable to predict the behaviors within my own family? Sometimes I don't even understand 
my own behavior. Humans are hopelessly irrational. End of the story. 

But science has strange tricks. It is unable to model the trajectory of a single particle in the air, but it 
models hurricanes and predicts weather. Similarly, the polling experts and the web advertisers have developed 
impressive techniques to predict and influence the behaviors of large groups of people with significant 
precision, although no one seems to be able to predict or influence what any particular individual will do. 

Most sciences try to model reality. But people do not obey models, and groups of people do not obey models. 
So the sciences concerned with people and with social groups largely took exception to the modeling task. It 
turns out, though, that the task may become easier when social networks are interleaved with networks of 
computers and devices, and when people are modeled together with other social and computational actors. 
Can it be that models better fit people because people better fit models? 

Acknowledgement. The first author would like to thank Wolter Pieters [61] for introducing him to Bruno 
Latour's ideas about actor-networks. 